To me, this is the first time Wayland feels like it's not a waste of time. The display server does not need to have the complexity of window management on top of the surface management. I certainly share the author's sentiment:
> Although, I do not know for sure why the original Wayland authors chose to combine the window manager and Wayland compositor, I assume it was simply the path of least resistance.
Though I'm not sure it was the path of least resistance per se (as a social phenomenon) so much as simply an easier problem to tackle. Or maybe the author means the same thing.
(That, and the remote access story needs to be fixed. It just works in X11. Last time I tried it with a system that had a 90-degree display orientation, my input was 90 degrees off from the real one. Now, this is of course just a bug, but I have a strong feeling that the way Wayland's architecture has been built makes these kinds of bugs much easier to create than in X11.)
I'm currently using a fully vibe-coded, personal River window manager that works just how I want it to. I switched to it after I realized I couldn't do everything I wanted in Hyprland (e.g. tile windows to equal areas instead of BSP by default).
Simple example of how impactful this separation has been for me.
BSP?
Binary space partitioning
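For anyone wondering what the difference looks like in practice, here is a rough sketch. It is not Hyprland's or River's actual code; the function names (`equal_columns`, `bsp`, `areas`) are made up for illustration, and the `bsp` variant is a simplified dwindle-style split where each new window takes half of whatever space is left.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height


def equal_columns(n: int, width: int, height: int) -> List[Rect]:
    """Split the output into n side-by-side columns of (nearly) equal width."""
    rects = []
    for i in range(n):
        # Integer edges, so rounding never leaves gaps or overlaps.
        x0 = i * width // n
        x1 = (i + 1) * width // n
        rects.append((x0, 0, x1 - x0, height))
    return rects


def bsp(n: int, width: int, height: int) -> List[Rect]:
    """Simplified binary split: each window takes half of the remaining
    space, alternating left/right and top/bottom cuts."""
    rects = []
    x, y, w, h = 0, 0, width, height
    for i in range(n):
        if i == n - 1:
            rects.append((x, y, w, h))  # last window gets whatever remains
        elif i % 2 == 0:
            rects.append((x, y, w // 2, h))  # cut left/right
            x, w = x + w // 2, w - w // 2
        else:
            rects.append((x, y, w, h // 2))  # cut top/bottom
            y, h = y + h // 2, h - h // 2
    return rects


def areas(rects: List[Rect]) -> List[int]:
    return [w * h for (_, _, w, h) in rects]


if __name__ == "__main__":
    print(areas(equal_columns(3, 1920, 1080)))  # three equal areas
    print(areas(bsp(3, 1920, 1080)))            # first window is twice as big
```

With three windows on a 1920x1080 output, the equal layout gives every window the same area, while the binary split hands the first window half the screen and leaves the other two a quarter each.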
The fact that Wayland can't just substitute out pluggable WMs without changing a bunch of other unrelated infrastructure is IMO one of the biggest user-facing losses relative to X11. Anybody who is working to improve that is doing god's work as they say.
Not only a loss but a key disabler. Having been used to the same customized window manager for decades, I find it impossible to change to Wayland until there's a fully equivalent interface for managing windows, so that everything works as I want, from mouse clicks to keyboard shortcuts. Maybe it could be an existing window manager adding support for River, or a Wayback layer that reimplements an X11 desktop root on top of a minimal Wayland compositor, but none of the current Wayland compositors even scratch the surface of this.
You only need a single implementation that exposes an API for running a WM as an extension.
I don't really get why it would be a good idea to mandate a specific architecture in the standard.
We need a compositor that exposes everything as an extension. Preferably in a hot-reloadable, tweakable way, say, using Lua (with JIT). And also exposing its APIs in a way that allows having an analog of xdotool.
It's a damper on development of new WMs and DEs, too. I have ideas for my own desktop I'd like to explore at some point, and if I do, it'll almost certainly be X11-based initially, because it's so much quicker and easier to wrap one's head around and to get an iteration loop up and running with.
I'm not anti-Wayland, and I think X11 has enough issues that it's worth transitioning to something better, but this is a critical weakness in Wayland's design.
How is a WM not just a simple plugin/extension? Find a display server you like and write an extension for it!
That would suffice if I were only looking to build a WM, but my goal is a full (lean) DE.
[flagged]
Yours? Because I know that mine wasn't.
You can do that already with libraries such as wlroots or Smithay
That's not the same thing. It's way easier to write an X11 window manager than to write a Wayland compositor, even with something like wlroots, because the window manager can speak the same protocol that clients speak, and it runs as a separate process.
As a concrete example, Emacs' EXWM package works by implementing an X11 client library in Emacs Lisp, then using it to talk to the X server (which is a separate process, so this works fine) and telling it how to position windows.
Whereas on Wayland, this is not possible without re-implementing a standalone compositor process, because otherwise architecturally it doesn't work. Emacs can't both do the drawing and be drawn.
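To make the X11 side of this concrete, here is a toy, single-process model of the idea. It is only a sketch: `ToyServer`, `ToyColumnsWM`, `select_redirect`, `map_request` and `configure_and_map` are invented names, not a real X11 API. In real X11 the corresponding pieces are a client selecting `SubstructureRedirectMask` on the root window and then receiving `MapRequest` / `ConfigureRequest` events, and only one client at a time is allowed to do that.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Geometry = Tuple[int, int, int, int]  # x, y, width, height


@dataclass
class ToyServer:
    """Stand-in for the display server: owns windows, knows no layout policy."""
    screen: Tuple[int, int] = (1920, 1080)
    mapped: Dict[int, Geometry] = field(default_factory=dict)
    # At most one client may claim the redirect role (the window manager).
    redirect_handler: Optional[Callable[[int, Geometry], None]] = None

    def select_redirect(self, handler: Callable[[int, Geometry], None]) -> None:
        if self.redirect_handler is not None:
            raise RuntimeError("another window manager is already running")
        self.redirect_handler = handler

    def map_request(self, window_id: int, wanted: Geometry) -> None:
        """An application asks to show a window."""
        if self.redirect_handler is None:
            self.mapped[window_id] = wanted  # no WM: honor the request as-is
        else:
            self.redirect_handler(window_id, wanted)  # the WM decides

    def configure_and_map(self, window_id: int, geometry: Geometry) -> None:
        """An ordinary request; the WM issues it like any other client."""
        self.mapped[window_id] = geometry


class ToyColumnsWM:
    """Policy only: never draws anything, just tells the server where windows go."""

    def __init__(self, server: ToyServer) -> None:
        self.server = server
        self.windows: List[int] = []
        server.select_redirect(self.on_map_request)

    def on_map_request(self, window_id: int, wanted: Geometry) -> None:
        # Ignore the geometry the application asked for and retile everything.
        self.windows.append(window_id)
        width, height = self.server.screen
        n = len(self.windows)
        for i, wid in enumerate(self.windows):
            x0, x1 = i * width // n, (i + 1) * width // n
            self.server.configure_and_map(wid, (x0, 0, x1 - x0, height))


if __name__ == "__main__":
    server = ToyServer()
    ToyColumnsWM(server)
    server.map_request(1, (10, 10, 300, 200))
    server.map_request(2, (0, 0, 50, 50))
    print(server.mapped)
```

The point of the sketch is that the window manager contains no rendering code at all. It only reacts to requests and sends ordinary requests back, which is why something like EXWM can live inside another program.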
EWM implements a Wayland compositor as a native thread spawned by a dynamic module in Emacs, it's a full compositor within the Emacs process: https://codeberg.org/ezemtsov/ewm
So it is architecturally possible (but infeasible in plain Emacs Lisp).
For river (the thing this article is about) I wrote an Emacs WM, but also opted for a dynamic module for the Wayland protocol parts: https://code.tvl.fyi/tree/tools/emacs-pkgs/reka
This one could technically be written in plain Emacs Lisp, but I'm happy to use something that already has all the XML codegen stuff for Wayland figured out. Dynamic modules work pretty well, fwiw.
No, that still requires you to make the whole thing, you just get help. For instance, I've run into a problem where I try some great new compositor that uses wlroots, and even though wlroots has good support for keyboard layouts I can't actually set the layout because the compositor hasn't wired up that functionality.
The article already addresses that...
It's not easy and the major compositors (Gnome, KDE) are NOT wlroots based, making this point mostly moot anyway.
This protocol at least has a chance of using a custom WM with an advanced compositor (which wlroots is not).
Especially with LLMs, the cost here is down significantly. People also drastically over-idealize what making an X window manager entailed: sure, X had its compositor, but you had to build so, so much yourself.
I'm glad River is trying to create a bigger base here; this is way cool. And it sort of proves the value of Wayland: someone can just go do that. Someone can just make a generic compositor/display-server now, with their own new architecture and plugin system, and it'll just work with existing apps.
We were so locked in to such a narrow, limited system, with its own abstraction layer parallel to what the kernel now offers (which didn't exist when X was created). It's amazing that we have a chance for innovation and improvement now. The kernel as a stable base of the pyramid, wlroots/sway as a next layer up, and now River as a higher layer still for folks to experiment and create with. This could not be going better, and there's so much more freedom and possibility; this is such a great engine for iteration and improvement.
[flagged]
"Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that.""
https://news.ycombinator.com/newsguidelines.html
First time I've seen you gray. What days to live in!
Yes, and I am praising them for tackling the idea. I don't know how you managed to misread me like that. I also read the article before commenting.
Sorry, I didn't address that at you but rather the other replies in this thread.
The Wayland standard does not prescribe it (unlike X), and the reference implementations were monolithic for a very long time.
Wayland in general had a rather cavalier approach to doing away with things that X users take for granted, like, well, making screenshots. Eventually, under pressure, those in charge agreed that these features are actually very important for real users, so implementations appeared. It's an understandable way to discover the minimal usable subset of features, but the process of it is a bit frustrating for the early adopters.
> so implementations appeared
Indeed - implementations, plural. Incompatible with each other, naturally.
We just read titles here
I've never used a system with Wayland (been on i3 for ~15 years) but every time a project like this comes up, I have to wonder why Wayland is even a thing. So many hoops to jump through for things that should be simple.
Sure, X11 has warts but I can make it do basically anything I want. Wayland seems like it will always have too much friction to ever consider switching.
The hoop I recently jumped through:
There's a type of input called "DeviceEvent" which is a bit lower level than "Window event". It also occurs even if the window isn't "active".
Windows and X11 support this, but Wayland doesn't except for mouse movement. I noticed my program stopped working on Linux after I updated it. Ended up switching to Window Events, but still kind of irritating.
> I can make it do basically anything I want
X11 has failed to do high refresh rates every time I've tried.
It runs just fine at 165 Hz for me. Given that xrandr and CRTs have been around for a while, and both have supported high refresh rates for a long time, something seems fishy here. Something is probably at fault, but it's not X11.
Huh? It did in 2000.
Sway is basically i3 on Wayland. You pretty much keep your config file (with a few modifications), there really isn’t much friction.
That’s not a reason to do it of course, for me the driver was support for multiple monitors with different scaling requirements.
So that's a Wayland ex-window manager then?
This is a really interesting direction.
Separating the compositor and window manager feels like one of those ideas that seems obvious in hindsight, but the protocol/state-machine design here shows why it took real work to make it practical.
Lowering the barrier for writing Wayland window managers without forcing everyone to build a full compositor seems like a big win.
Are you human? If yes, sorry for the offensive question. Your account is new.
If Wayland doesn't get this solved then I'll just use X11 forever, with coding agents to keep it running if I have to.
You could use xlibre, although some people say it's a joke
Insightful article. I don't recall ever viewing an easy-to-follow lesson, tutorial or book for that matter that clearly explained the various components of a Linux Desktop environment. Always had to follow complicated and obscure guides to do this and that, when solving issues, but seldom did any explain their functions clearly.
super interested to hear more on this.
i'm a little thrown, because the Wayland diagram doesn't feel quite right. the compositor does lie between the kernel and the apps, but IIRC the apps have their own graphics buffers from the kernel that they are drawing into directly. the compositor then composites them together. to me, that feels more like the kernel is at the center of the diagram here: the wayland compositor is between the kernel and the output / input.
i don't think it has a huge impact on the discussion here. but this is such a key difference versus X, that i think is hugely under-told: Wayland compositors all rely on lots of kernel facilities to do the job, whereas X is basically its own kernel, has origins where it effectively was the device driver for the gpu, talking to it over pci, and doing just about everything. when people contrast wayland versus X as wayland compositors needing to do so much, i can't help but chuckle, because it feels like the kernel does >50% of what X used to have to do itself; it's a much simpler world, using the kernel's built-in abstractions, rather than being multiple stacked layers of abstractions (kernels + X's own).
it means that the task of writing the display-server / compositor is much much much simpler. it's still hard! but the kernel is helping so much. there's an assumed base of having working GPU drivers!
author appears to super know their stuff. alas the FOSDEM video they link to is not loading for me. :(
one major question, since this is a protocol, how viable is it to decompose the window management tasks? rather than have a monolithic window manager, does this facilitate multiple different programs working together to run a desktop? not entirely sure the use case, but a more pluggable desktop would be interesting!
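A toy illustration of the "apps draw into their own buffers, the compositor composites them" point from the comment above. This is only a sketch: buffers here are grids of characters, whereas a real compositor works with shared-memory or GPU buffers and scans out through the kernel, and `make_buffer`, `composite` and `render` are made-up names.

```python
from typing import List, Tuple

Buffer = List[List[str]]  # rows of single-character "pixels"


def make_buffer(width: int, height: int, fill: str) -> Buffer:
    """A client 'renders' its window by filling its own buffer."""
    return [[fill] * width for _ in range(height)]


def composite(out_w: int, out_h: int,
              surfaces: List[Tuple[Buffer, int, int]]) -> Buffer:
    """Copy client buffers onto the output, back to front.

    Each surface is (buffer, x, y). The clients already produced their
    pixels; the compositor only places them and clips to the output.
    """
    output = make_buffer(out_w, out_h, ".")
    for buf, x, y in surfaces:
        for row_idx, row in enumerate(buf):
            for col_idx, pixel in enumerate(row):
                ox, oy = x + col_idx, y + row_idx
                if 0 <= ox < out_w and 0 <= oy < out_h:
                    output[oy][ox] = pixel
    return output


def render(buffer: Buffer) -> str:
    return "\n".join("".join(row) for row in buffer)


if __name__ == "__main__":
    terminal = make_buffer(3, 2, "A")
    browser = make_buffer(3, 2, "B")
    # The browser is listed last, so it ends up on top where they overlap.
    print(render(composite(6, 3, [(terminal, 0, 0), (browser, 2, 1)])))
```

Where the windows go and which one is on top is exactly the policy a window manager would supply; the copying loop is the part that stays in the compositor.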
>i don't think it has a huge impact on the discussion here. but this is such a key difference versus X, that i think is hugely under-told: Wayland compositors all rely on lots of kernel facilities to do the job, whereas X is basically its own kernel, has origins where it effectively was the device driver for the gpu, talking to it over pci, and doing just about everything. when people contrast wayland versus X as wayland compositors needing to do so much, i can't help but chuckle, because it feels like the kernel does >50% of what X used to have to do itself; it's a much simpler world, using the kernel's built-in abstractions, rather than being multiple stacked layers of abstractions (kernels + X's own).
Are you an AI bot? Modern X11 servers using DRM have been around for more than 20 years. You are talking about how X11 servers worked in the '90s.
The Xorg codebase still includes some of those old drivers and is structured to allow them to exist.
Just to be clear the hardware abstraction layer used by wayland and any current Xserver is exactly the same.
Yes, exactly. DRM exists, but there's still what I called the X "kernel", with all of its heavyweight abstractions.
To the previous a-hole, frak you: not an AI. That's rude as frak. Also, you manage to be incredibly wrong. Even an AI wouldn't overlook such an obvious error; maybe it'd be better to have it replace you. So rude dude! Behave!
I am sorry if I mistook you for a bot, but the model you are describing has not been implemented by any graphics driver in decades.