I’m curious how software can be created and evolve over time. I’m afraid that at some point, we’ll realize there are issues with the software we’re using that can only be remedied by massive changes or a complete rewrite.
Are there any instances of this happening? Where something is designed with a flaw that doesn’t get realized until much later, necessitating scrapping the whole thing and starting from scratch?
Happens all the time on Linux. The current instance would be the shift from X11 to Wayland.
And then ALSA to all those barely functional audio daemons to PulseAudio, and then again to PipeWire. That sure one took a few tries to figure out right.
And the strangest thing about that is that neither PulseAudio nor Pipewire are replacing anything. ALSA and PulseAudio are still there while I handle my audio through Pipewire.
How is PulseAudio still there? I mean, sure the protocol is still there, but it’s handled by
pipewire-pulse
on most systems nowadays (KDE specifically requires PipeWire).Also, PulseAudio was never designed to replace ALSA, it’s sitting on top of ALSA to abstract some complexity from the programs, that would arise if they were to use ALSA directly.
Pulse itself is not there but its functionality is (and they even preserved its interface and pactl). PipeWire is a superset of audio features from Pulse and Jack combined with video.
For anyone wondering: Alsa does sound card detection and basic IO at the kernel level, Pulse takes ALSA devices and does audio mixing at the user/system level. Pipe does what Pulse does but more and even includes video devices
And then from ALSA to PulseAudio haha
They’re at different layers of the audio stack though so not really replacing.
Wayland, Pipewire, systemd, btrfs/zfs, just to name a few.
Wayland is THE replacement to broken, hack-driven, insecure and unmaintainable Xorg.
Pipewire is THE replacement to the messy and problematic audio stack on Linux to replace Pulseaudio, Alsa etc.
SystemD is THE replacement to SysVinit (and is an entire suite of software)
Like many, I am not a fan of SystemD and hope something better comes along.
Yes, I know. I was answering the question of if there were instances of this happening.
I would say the whole set of C based assumptions underlying most modern software, specifically errors being just an integer constant that is translated into a text so it has no details about the operation tried (who tried to do what to which object and why did that fail).
Assembly doesn’t have concept of objects.
It does very much have the concept of objects as in subject, verb, object of operations implemented in assembly.
As in who (user foo) tried to do what (open/read/write/delete/…) to which object (e.g. which socket, which file, which Linux namespace, which memory mapping,…).
implemented in assembly.
Indeed. Assembly is(can be) used to implement them.
As in who (user foo) tried to do what (open/read/write/delete/…) to which object (e.g. which socket, which file, which Linux namespace, which memory mapping,…).
Kernel implements it in software(except memory mappings, it is implemented in MMU). There are no sockets, files and namespaces in ISA.
You were the one who brought up assembly.
And stop acting like you don’t know what I am talking about. Syscalls implement operations that are called by someone who has certain permissions and operate on various kinds of objects. Nobody who wants to debug why that call returned “Permission denied” or “File does not exist” without any detail cares that there is hardware several layers of abstraction deeper down that doesn’t know anything about those concepts. Nothing in the hardware forces people to make APIs with bad error reporting.
And why “Permission denied” is bad reporting?
Because if a program dies and just prints strerror(errno) it just gives me “Permission denied” without any detail on which operation had permissions denied to do what. So basically I have not enough information to fix the issue or in many cases even to reproduce it.
It may just not print anything at all. This is logging issue, not “C based assumption”. I wouldn’t be surprised if you will call “403 Forbidden” a “C based assumtion” too.
But since we are talking about local program, competent sysadmin can
strace
program. It will print arguments and error codes.
You have stderr to throw errors into. And the constants are just error codes, like HTTP error codes. Without it how computer would know if the program executed correctly.
You throw an exception like a gentleman. But C doesn’t support them. So you need to abuse the return type to also indicate “success” as well as a potential value the caller wanted.
Exceptionss are bad coding, and what’s abusive of using the full range of an integer? 0 success, everything else, error - check the API for details or call
strerror
.Returning error codes in-band is the reason for a significant percentage of C bugs and security holes when the return value is used without checking. Something like Rust’s Result type that forces you to distinguish the two cases is much better design here. And no, you are not working with a whole language ecosystem of “sufficiently disciplined programmers” so that nobody ever forgets to check a return value.
Not to mention that errno is just a very broken design in the times of modern thread and event systems, signals, interrupts and all kinds of other ways to produce race conditions and overwrite the errno value before it is checked.
errno is bad programming.
So you need to abuse the return type to also indicate “success” as well as a potential value the caller wanted.
You don’t need to.
Returnung structs, returning by pointer, signals, error flags, setjmp/longjmp, using cxa for exceptions(lol, now THIS is real abuse).
stderr is useless if the syscall already returns a single integer only because of stupid C conventions.
You mean 0 indicating success and any other value indicating some arbitrary meaning? I don’t see any problem with that.
Passing around extra error handling info for the worst case isn’t free, and the worst case doesn’t happen 99.999% of the time. No reason to spend extra cycles and memory hurting performance just to make debugging easier. That’s what debug/instrumented builds are for.
Passing around extra error handling info for the worst case isn’t free, and the worst case doesn’t happen 99.999% of the time.
The case “I want to know why this error happened” is basically 100% of the time when an error actually happens.
And the case of “Permission denied” or similar useless nonsense without any details costing me hours of my life in debugging time that wouldn’t be necessary if it just told me permission for who to do what to which object happens quite regularly.
“0.001% of the time, I wanna know every time 👉😎👉”
Yeah, I get that. But are we talking about during development (which is why we’re choosing between C and something else)? In that case, you should be running instrumented builds, or with debug functionality enabled. I agree that most programs just fail and don’t tell you how to go about enabling debug info or anything, and that could be improved.
For the “Permission Denied” example, I also assume we’re making system calls and having them fail? In that case it seems straight forward: the user you’re running as can’t access the resource you were actively trying to access. But if we’re talking about some random log file just saying “Error: permission denied” and leaving you nothing to go on, that’s on the program dumping the error to produce more useful information.
In general, you often don’t want to leak more info than just Worked or Didn’t Work for security reasons. Or a mix of security/performance reasons (possible DOS attacks).
During development is just about the only time when that doesn’t matter because you have direct access to the source code to figure out which function failed exactly. As a sysadmin I don’t have the luxury of reproducing every issue with a debug build with some debugger running and/or print statements added to figure out where exactly that value originally came from. I really need to know why it failed the first time around.
As sysadmin you should know about strace
I know about strace, strace still requires me to reproduce the issue and then to look at backtraces if nobody bothered to include any detail in the error.
Somehow (lack of) backtrace and details in error is “C based assumption”
Yeah, so it sounds like your complaint is actually with application not propagating relevant error handling information to where it’s most convenient for you to read it. Linux is not at fault in your example, because as you said, it returns all the information needed to fix the issue to the one who developed the code, and then they just dropped the ball.
Maybe there’s a flag you can set to dump those kinds of errors to a log? But even then, some apps use the fail case as part of normal operation (try to open a file, if we can’t, do this other thing). You wouldn’t actually want to know about every single failure, just the ones that the application considers fatal.
As long as you’re running on a turing complete machine, it’s on the app itself to sufficiently document what qualifies as an error and why it happened.
Ugh, I do not miss C…
Errors and return values are, and should be, different things. Almost every other language figured this out and handles it better than C.
It’s more of an ABI thing though, C just doesn’t have error handling.
And if you do exception handling wrong in most other languages, you hamstring your performance.
The unofficial C motto “Make it fast, who gives a shit about correctness”
Errors and return values are, and should be, different things.
That’s why errno and return value are different things.
Not too relevant for desktop users but NFS.
No way people are actually setting it up with Kerberos Auth
100% this
We need a networked file system with real authentication and network encryption that’s trivial to set up and that is performant and that preserves unix-ness of the filesystem, meaning nothing weird like smb, so you can just use it as you would a local filesystem.
The OpenSSH of network filesystems basically.
So sshfs or sftp?
Performance of those is atrocious.
Easy, Gimp.
By extension, GTK. I was shocked to learn it’s C with object orientation bolted onto it.
Starting anything from scratch is a huge risk these days. At best you’ll have something like the python 2 -> 3 rewrite (leaving scraps of legacy code all over the place), at worst you’ll have something like gnome/kde (where the community schisms rather than adopting a new standard). I would say that most of the time, there are only two ways to get a new standard to reach mass adoption.
-
Retrofit everything. Extend old APIs where possible. Build your new layer on top of https, or javascript, or ascii, or something else that already has widespread adoption. Make a clear upgrade path for old users, but maintain compatibility for as long as possible.
-
Buy 99% of the market and declare yourself king (cough cough chromium).
Python 3 wasn’t a rewrite, it just broke compatibility with Python 2.
In a good way. Using a non-verified bytes type for strings was such a giant source of bugs. Text is complicated and pretending it isn’t won’t get you far.
-
Systemd. Nuke it from fucking orbit.
Still a lot of zealotry going on judging by the votes… i say rewrite systemd in rust!
Everyone hates on it. Here I am; a simply Silverblue user and it seems fine to me. What is the issue actually?
Everyone doesn’t. Just a handful of loud idiots who did all don’t work with init systems. It is objectively better. There are some things you could criticise, but any blanket statement like that is just category a.
Maybe not exaclly Linux, sorry for that, but it was first thing that get to my mind.
Web browsers really should be rewritten, be more modular and easier to modify. Web was supposed to be bulletproof and work even if some features are not present, but all websites are now based on assumptions all browsers have 99% of Chromium features implemented and won’t work in any browser written from scratch now.Agreed. I mean, metadata should be protocol stuff, not document stuff. And rendering (font size etc) should be user side, not developer side. Browser should be modular, not a monolith. Creating a webpage should be easy again.
The same guys who create Chrome have stuffed the web standards with needlessly bloated fluff that makes it nearly impossible for anyone else to implement it. If alternative browsers have to be a thing again, we need a new standard, or at least the current standard with significantly large portions removed.
we need to just rebuild the web, built on a decentralized LoRa or such mesh network.
WWW ≠ Internet.
Internet has some things to be fixed too (https://secushare.org/broken-internet), but is not as doomed as the web.
That sounds like amp, no thanks.
Most of the standards themselves aren’t the problem, we just shouldn’t have to rely so badly on them that a site immediately is dead if a small item is not available
If you want a browser truly written “from scratch”, you need to check-out Ladybird:
Omg nobody has mentioned FHS?!
What that?
Amended my post 😉
Basically, install Linux on your daily driver, and hide your keyboard for a month. You’ll discover just what needs quality of life revising
I dont know if this even makes sense but damn if bluetooth/ audio could get to a point of “It just works”.
It does for me. What issue are you having?
What’s your latest disfavor?
Mine is the priorisation of devices. If someone turns on the flatshare BT box and I’m listening to Death Metal over my headphones, suddenly everyone except me is listening to Death Metal.
Not to mention bluez aggressive conne ts to devices. It would be nice if my laptop in the other room didn’t interrupt my phones connection to my earbuds.
Then again, we also have wired for a reason. Hate all you want but it works and is predicable
Just being… crappy?
Not connecting automatically. Bad quality. Some glitchy artifacts. It gets horrible The only work around I’ve found is stupid but running
apt reinstall --purge bluez gnome-bluetooth
and it works fine. So annoying but I have to do this almost every day.Reinstalling should change nothing. If its getting corrupted check your drive and Ram.
I don’t know why this works, but if im having issues, i do this, and it fixes all of them across the board. Even just restarting the service is not as effective as this. That some times works, sometimes doesn’t.
I’m confident its not a drive or ram issue. Its a blue tooth issue/ audio. But I also can’t explain why it is so consistent.
That really sounds like shitty firmware at one end or the other
Have you checked the logs?
It’s been a while (few years actually) since I even tried, but bluetooth headsets just won’t play nicely. You either get the audio quality from a bottom of the barrel or somewhat decent quality without microphone. And the different protocol/whatever isn’t selected automatically, headset randomly disconnects and nothing really works like it does with my cellphone/windows-machines.
YMMV, but that’s been my experience with my headsets. I’ve understood that there’s some propietary stuff going on with audio codecs, but it’s just so frustrating.
Bluetooth in general is just a mess and it’s sad that there’s no cross-platform sdk written in C for using it.
It’s actually a classic programmer move to start over again. I’ve read the book “Clean Code” and it talks about a little bit.
Appereantly it would not be the first time that the new start turns into the same mess as the old codebase it’s supposed to replace. While starting over can be tempting, refactoring is in my opinion better.
If you refactor a lot, you start thinking the same way about the new code you write. So any new code you write will probably be better and you’ll be cleaning up the old code too. If you know you have to clean up the mess anyways, better do it right the first time …
However it is not hard to imagine that some programming languages simply get too old and the application has to be rewritten in a new language to ensure continuity. So I think that happens sometimes.
Yeah, this was something I recognized about myself in the first few years out of school. My brain always wanted to say “all of this is a mess, let’s just delete it all and start from scratch” as though that was some kind of bold/smart move.
But I now understand that it’s the mark of a talented engineer to see where we are as point A, where we want to be as point B, and be able to navigate from A to B before some deadline (and maybe you have points/deadlines C, D, E, etc.). The person who has that vision is who you want in charge.
Chesterton’s Fence is the relevant analogy: “you should never destroy a fence until you understand why it’s there in the first place.”
I’d counter that with monolithic, legacy apps without any testing trying to refactor can be a real pain.
I much prefer starting from scratch, while trying to avoid past mistakes and still maintaining the old app until new up is ready. Then management starts managing and new app becomes old app. Rinse and repeat.
The difference between the idiot and the expert, is the expert knows why the fences are there, and can do the rewrite without having to relearn lessons. But if you’re supporting a package you didn’t originally write, a rewrite is much harder.
Which is something I always try to explain to juniors: writing code is cool, but for your sake learn how to READ code.
Not just understanding what it does, but what was it all meant to do. Even reading your own code is a skill that needs some focus.
Side note: I hate it to my core when people copy code mindlessly. Sometimes it’s not even a bug, or a performance issue, but something utterly stupid and much harder to read. But because they didn’t understand it, and didn’t even try, they just copy-pasted it and went on. Ugh.
“you should never destroy a fence until you understand why it’s there in the first place.”
I like that; really makes me think about my time in building-games.
There’s already a lot of people rewriting stuff in Rust and Zig.
What are the advantages of Zig? I’ve seen lots of people talking about it, but I’m not sure I understand what it supposedly does better.
Tiny learning curve, easy to refactor existing projects
The goal of the zig language is to allow people to write optimal software in a simple and explicit language.
It’s advantage over c is that they improved some features to make things easier to read and write. For example, arrays have a length and don’t decay to pointers, defer, no preprocessor macros, no makefile, first class testing support, first class error handling, type inference, large standard library. I have found zig far easier to learn than c, (dispite the fact that zig is still evolving and there are less learning resources than c)
It’s advantage over rust is that it’s simpler. Ive never played around with rust, but people have said that the language is more complex than zig. Here’s an article the zig people wrote about this: https://ziglang.org/learn/why_zig_rust_d_cpp/
In reality this happens all the time. When you develop a codebase it’s based on your understanding of the problem. Over time you gain new insights into the environment in which that problem exists and you reach a point where you are bending over backwards to implement a fix when you decide to start again.
It’s tricky because if you start too early with the rewrite, you don’t have a full understanding, start too late and you don’t have enough arms and legs to satisfy the customers who are wanting bugs fixed in the current system while you are building the next one.
… or you hire a new person who knows everything and wants to rewrite it all in BASIC, or some other random language …
There are many instances like that. Systemd vs system V in it, x vs Wayland, ed vs vim, Tex vs latex vs lyx vs context, OpenOffice vs liber office.
Usually someone identifies a problem or a new way of doing things… then a lot of people adapt and some people don’t. Sometimes the new improvement is worse, sometimes it inspires a revival of the old system for the better…
It’s almost never catastrophic for anyone involved.
Some of those are not rewrites but extensions/forks
I’d say only open/libreoffice fits that.
Edit: maybe Tex/latex/lyx too, but context is not.
LaTeX and ConTeXt are both macros for TeX. LyX is a graphical editor which outputs LaTeX.
Yes… I’d classify context as a reboot of latex.