How often do I need to `setenv()` anything? The answer is "Never" in the vast majority of programs, because ENVVRS are usually read rather than set, so this issue is nonexistent for them.
For the vast majority of the small amount of programs that actually need to use `setenv()`, the answer is: "Maybe once or twice during the entire lifetime of the process, and then only at the very start, probably even before running any threads", meaning this issue is nonexistent for them as well.
So, is there a potential issue with thread safetey? Yes. Does it matter given where and under what circumstances it occurs? Not really.
What kind of actual real life production code would continuously set envvars while simutaneously calling a function that tries to read the environment?
Yes, this is a footgun. But even the issues author acknowledges, in the issue thread:
Realistically: this is a pretty rare problem, and documenting
it is probably a fine solution. This is probably going to cost
someone else a couple of days of debugging every couple of
years
> It has wasted thousands of hours of people's time, either debugging the problems, or debating what to do about it.
People don't like APIs that can randomly crash your program while there's no good technical reason for why they should. Why not fix the problem? People like you, who have no issues with the current implementation, won't see any regressions because you're already a good citizen, and myriad other programmers whose programs do occasionally crash because of this will be helped.
> So, is there a potential issue with thread safetey? Yes. Does it matter given where and under what circumstances it occurs? Not really.
"The unpredictable crashes only happen very rarely" doesn't mean the crashes go away.
> What kind of actual real life production code would continuously set envvars while simutaneously calling a function that tries to read the environment?
The reproduction sample calls setenv in a loop so the issue can be reproduced. A single setenv anywhere in the code is enough to trigger the crash, but then you would get one of those "you need to run the program a million times to reproduce it" bug reports that gets pushed down the line.
Because doing so breaks backwards compatibility, simple as that.
The problem isn't even that `setenv` isn't thread save. The problem is that `getenv` returns a `*char` directly into the environment memory space. Many many many programs rely on that being the case.
> People like you
People like me would like every software to be perfect, but that's not the world we live in, so we are forced to be pragmatic. When fixing something causes more problems by breaking backwards compatibility promises, than it prevents, then there is no good argument for a fix, and the correct approach is to say "yes, this sucks, let's document it well so people don't waste too much time on this".
The setenv/getenv problem is such a case. Anyone who disagrees is free to fork glibc, implement whatever fix they think is adequate, and then try to compile the software packages found on a typical Linux server against the result.
> so the issue can be reproduced.
"Can be reproduced" and "is a common issue in production code" are not the same.
Fact is, almost all production programs that set envvars, do so once, very early in the process lifecycle, and then never again, and so are never affected by this.
So why not implement the fix suggested in the article: improve the existing interface to the extent possible, and introduce a new interface which is easier to use correctly.
There is nothing to "improve" on the existing interface, really. From a C point of view ... a _hidden_ global lock is worse than no lock at all. Because in the latter case ... you, as the programmer, have a choice what to do. If you never call setenv(), no locks. If you only ever call setenv() in your startup code, no locks. If you only ever call setenv() after fork&co, no locks. And if you do believe you need to call it at runtime, but are singlethreaded ... still no locks. And if you really really really need to call it from a multithreaded process, concurrently with getenv(), then lock around both and make your getenv() "safe" wrapper create you an owned point-in-time copy - basically a getenv_r().
Note also that "global references" like getenv() returns and point-in-time owned snapshots don't behave the same way. Say, a library initializer code could retrieve a number of env var references by calling getenv(), and then use those at runtime. No more need/use for getenv() again after - and even perf-sensitive code could look at the env var. With a func that copies, the perf-sensitive code would need to do that each time (lock, lookup, copy). Not strongly desirable.
Also ... UNIX is rather flexible ... and if you so wish, you _can_ substitute _your own_ setenv()/getenv() by the magic of dynamic linking. To create a set that locks and returns you leaked copies (changes the semantics of getenv so that the caller must free the pointer to avoid a leak). It's all possible to do this.
I'm getting the impression from this that we see a "go tantrum" here. "I make my own standards but I wanna use that C/Unix standard thing as well but not how it is because it's not nice it should take go into account waaaahwaaah ...".
It is not _nice_ to modify your own env at runtime. Maybe, just maybe ... that's for reasons. Because not everything that can be done is also a great idea.
It says loud and clear "The setenv() function need not be reentrant. A function that is not required to be reentrant is not required to be thread-safe."
> "The unpredictable crashes only happen very rarely" doesn't mean the crashes go away.
If you get a crash over setenv() reading the manual page of setenv C call should be your first step. And the only step. The bigger issue is in design of application that has wrongly assumed setenv() is thread-safe. That requires a refactoring and is solely due to developer misunderstanding the API.
I'm a UNIX/C programmer for decades and I don't care about this.
There is no such thing as beautiful API design. Every design is a compromise. If you think non-reentrant calls should be deprecated in POSIX take it to the committee.
There is a myriad of non-reentrant code both in POSIX spec and in libc implemenations. You need to RTFM, I'm sorry.
There is no "coherent API" as far as null termination goes too. Some library functions deal with it, some calls don't. You need to RTFM.
I also want to know OP's reason to even use setenv() in a multithreaded piece of software. It's like an oxymoron. setenv and vars are useful to pass on data from parent process to forked children because they inherit the environment. If you use the threading model you don't need it. If your application is a single process setenv() is useless.
Putting aside whether or not the design is awful, the fact that it's standardized and documented is absolutely a valid argument. Changing it now would break backward compatibility. That should always be a showstopper.
Programmers who are using any library code without reading and understanding the documentation are asking for trouble regardless of language.
The correct solution to your objections is to create new functions that behave as you prefer.
The real skinny of it is that it’s in the name: “Environment”.
If you’re calling setenv in the middle of your program, you fucked up.
There are those things in programming that should be extremely triggering to your “what the actual fuck?!” senses, and “setenv in the middle of runtime” is one of those things.
True, but for every envvar a program reads, something called setenv on it originally. It’s not like no programs call setenv in the middle of runtime. Examples:
> but for every envvar a program reads, something called setenv on it originally
That's not true, that's just misunderstanding how it works. `execve()` takes an entirely new copy of environment variables to give to the child, that's the "real" way to do it.
The child process's environment for these purposes is constructed without mutating its parent's environment - a copy is used - and before the child process actually runs the target code it was created to run. So there is no possibility of race between mutations to the environment and reads of the environment. If you are writing such a tool but doing something other than this, you are doing it wrong.
No, a process gets its environment variables from the operating system (just like argc, argv) before any code is ever executed and the majority never change them.
You are literally putting forth arguments in favour of fixing the thread safety issue, and then conclude it’s not worth the effort.
It’s simple, really: we indeed rarely to `setenv()`. So it’s not a performance problem. So we can make it thread safe, and the performance impact will be negligible. In exchange for this small price, safety will increase.
Sacrificing any amount of safety for a negligible improvement in performance is flat out unprofessional, and should be grounds for immediate termination in most contexts.
> You are literally putting forth arguments in favour of fixing the thread safety issue, and then conclude it’s not worth the effort.
Yes. I do. These two concepts don't contradict each other.
> No it’s not a performance problem. So we can make it thread safe, and the performance impact will be negligible
Who said anything about performance being the problem, or a reason not to change it!?
The problem is BACKWARDS COMPATIBILITY. The issue is that `getenv` returns a `*char` into the envvar array. Basically every application that uses this function relies on this fact.
So we have:
a) A potential issue that occurs only in very unusual circumstances, most of which will never occur in production code and on the odd chance that they do, they can easily be avoided. Documenting that well can help prevent time wasted in debugging.
b) A fix that may prevent a) but breaks backwards compatibility promises, and would necessitate reworking god knows how many programs, the vast majority of which were never impacted by the issue in the first place.
Of these 2 options, a) is just the better one. Yes, in an idea world, we could have pure, 100% bug free code, and spend an unlimited amount of time on fixing every last problem. That's not the world we live in however, and so a pragmatic approach is simply a necessity.
How do you propose making it thread-safe? The real problem here is that `getenv()` was designed around it returning a `char *` into some read-only memory. It's a bad API if the backing data can change because the returned pointer is assumed to exist 'forever'.
`setenv()` has no way to knowing where those pointers are floating around so there's no way to safely change the environment variables. The best you could do would be to leak memory every time you set new environment variables so that the old pointers don't get invalidated, and that just creates a new problem and reason not to use `setenv()` (that's arguably worse).
Here's my proposal: Introduce a new threadsafe API (`tgetenv` or whatever) which takes _two_ `char *`s, one of which is a return buffer. This leaves allocation as a responsibility of the caller.
And then you can leave the existing syscalls as they are (thread unsafe) while having a separate thread safe version.
I agree that would be the way to do it, but now we're no longer talking about simply 'fixing' the implementation of the existing API but rather introducing a new function you have to use.
`setenv()` would only be safe if your program never uses `getenv()`, and calls to `getenv()` are so numerous and all over the place that for most non-trivial programs it would be hard to ensure they never happen.
There's also the rub that `setenv()` is not part of the C standard, it's POSIX. I don't think the C standard would ever introduce `tgetenv()` to fix a problem it doesn't have, so non-POSIX code would have to continue to call `getenv()` since that's all that is available to them.
> I don't think the C standard would ever introduce `tgetenv()` to fix a problem it doesn't have
The C standard has no problem acknowledging that getenv is subject to data races for most of its implementations. As far as I can tell that part was even added at the same time as threading support.
That doesn't really help you determine whether a given library is using `getenv()` or not. That also requires that things are actually recompiled/updated, which for some C libraries is not that often.
There's also the rub that many C programs do not target the latest standard (for a variety of reasons). I didn't realize `getenv_s` was added in C11 (though it's optional), but it doesn't really matter because programs/libraries that target C89 or C99 can't use it anyway.
That’s a good point. I would guess there are ways of doing static analysis to see if a given binary is making getenv() calls though, even if one doesn’t fully grok its source.
Maybe some combo of that with sentenv() in your source or something
Or do “live” analysis under integration and give a low priority warning
Yeah. Most of my code these days works like this, to the point where almost all of my allocation happens roughly in my main entry-point. Or I use arena allocators and pass those into my utilities. But there's basically never a `realloc` or `malloc` call deep in my code anymore.
> to safely change the environment variables. The best you could do would be to leak memory every time you set new environment variables so that the old pointers don't get invalidated, and that just creates a new problem and reason not to use `setenv()` (that's arguably worse).
Arguably worse? My goodness no.
This is a rare edge case that most programs don’t encounter. Option 1 is to crash and explode and die. Option 2 is to leak tens of bytes.
Leaking tens of bytes is for sure NOT worse than crashing.
You could even put a note in the man page that `setenv()` will leak memory. Then ten or twenty years from now there will be a blog post about how a currently trendy language/runtime can be manipulated into looping over `setenv()` zillions of times and OOM'ing, and comments about how no one can possibly be expected to read the man page for this horrible footgun, and it's wrong to expect developers to have any idea about what they're doing, give a shit, or pay attention at all.
> Leaking tens of bytes is for sure NOT worse than crashing.
I do really disagree here. The answer is not clear at all.
But then, you are mischaracterizing the problem. The issue is not with crashing, you can get plain bad data too, and this is clearly worse than both leaking memory and crashing.
Also, the GP is mischaracterizing the options. You don't need to leave the old values around, you can just copy them into userspace memory.
My reasoning is simple - the issues here can be avoided if you're careful about how you use `setenv()` and `getenv()`, which many programs already are. The memory leak in contrast would never be avoidable regardless of how you use it.
The problem with “be careful” is that libraries often want to use the very unsafe API and there is no standard mechanism to expose safety. It’s fundamentally a bad API design. It could be good. But it is not.
There’s a reason this problem comes up on HN once or two a year. And don’t even get me started about printf grabbing a mutex for a stupid locale…
This is not a good argument imo. Its “rarity” still affects a tremendous number of folks in profoundly vexing ways that are difficult to debug on account of this not only affecting C but innumerable other languages’ compilers and interpreters that rely on the stock getenv implementation.
I wouldn’t be surprised if a good chunk of compilers and interpreters in other languages suffer from this gotcha’.
I mean, I wouldn’t even be surprised if some JVM implementations silently expose their users to bugs on account of this implementation.
Ooh, the old 'unprofessional' epithet! What do you mean by that slur here? Most can't agree on what professional even means. Additionally, why should one be held to artificial, inconsistent, and poorly defined standards of 'professionalism' when they aren't a professional?
> Ooh, the old 'unprofessional' epithet! What do you mean by that slur here?
For an action I generally mean "malpractice". Something bad enough to bar repeat offenders from the profession (if we even were a profession, which we’re not). For a person I mean "unfit to program code other people rely on".
> My care for code robustness scales with income.
Good point: the conditions for us to write code that actually works are too rarely met. The only answer I have for this one is political though, not technical.
Unpopular opinion: Neither Go's os.Setenv nor Rust's std::env::set_var() should exist. I was pleased to find that Java only has System.getEnv(), but not a setter.
I think there are good reasons for Setenv and set_var to exist, but if they are implemented, they shouldn't be wrappers around POSIX' shitty API and implement their own environment variable system instead (one of which the initial variables are possibly initialised by a call to getenv to make them compatible).
There's no reason why these languages need to restrict themselves the same way C does.
That doesn't fix the problem: these languages has to be able to coexist peacefully with C in the same address space. You can have a dynamically linked library written in Rust in a host program written in C, you can use C libraries in Go, etc.
Even if that wasn't an issue: this is a bug in C as well! You should absolutely be able to use setenv/getenv safely in multi-threaded C, it's insanity that you can't.
The bug in Golang was because DNS lookups interact with the C library, which looks up environment variables. As long as everything happens in Goland, there is no problem - but this is simply not good enough.
Go makes the assumption that the DNS lookups are thread-safe, but it doesn't have that guarantee (or the C library is spec-incompliant, but I doubt that). It's still something Go can fix.
You can't fix C libraries loaded into Go programs (i.e. and external library calling C's setenv, or I suppose explicit FFI calls by the user), but Go can be responsible for the APIs it calls itself. That may necessitate writing a thread-safe alternative for DNS lookups, or documenting and/or adding compile time warnings that threaded programs doing DNS lookups will just crash sometimes, but the language's standard library can still make it much harder for developers to write buggy code.
My impression is that this was Golang's plan from the start - this is why they didn't want to use the C stdlib at all, issuing the Kernel syscalls directly from the Golang runtime. A good idea, but then they had to backpedal to solve issues such as DNS resolution respecting certain OS settings, and this bug is a symptom of that.
Yes, there are certain things in UNIX which _are_ part of the standard (POSIX / IEEE1003) but _aren't_ usually implemented as system calls.
Name lookups (whether user identities or network resources) are the biggest chunk of these. You have a "choice" as a user/programmer here. Say, the existing name lookup interfaces in most libc implementations don't do DNS-over-HTTP (DoH); you can implement that yourself and just use the addresses returned by your library/package where the system calls ... want addresses.
If you have the go stance, go all the way. Don't say "the C runtime is sh*te but I really really really want that one particular teensy tiny bit of it could someone somewhere somehow please do something to make it a little less sh*te". Legacy baggage is a burden and backwards compatibility shackles you. The C/Unix interfaces are full of this, and with the hindsight of 50 years noone today, not even "C programmers", would implement them all the same way again. But that doesn't mean their behaviour can be arbitrarily changed.
> Go makes the assumption that the DNS lookups are thread-safe
DNS functions are thread-safe.
The thing people aren't understanding here is when you set loose nasal demons (such as by calling `setenv` in a multithreaded program), they can cause problems even in safe code.
If a function is safe only if everyone else you rely on never calls a particular function, it's not that safe. Certainly less safe than other functions guaranteed not to result in crashes if you use them right.
That is an unpopular opinion for the simple reason that some programs do in fact need to set envvars, particularly programs that will start child processes.
Nope, execve() and friends ending in 'e' accept a pointer to a completely new set of environment variables, no need to do setenv. Windows has _execve() too.
That is still possible using the java.lang.ProcessBuilder API: you can launch a child process and give it a modified environment, but just at launch time. This side-steps the issue.
I think the probably is really that there are 2 times where you should be setting env vars 99% of the time.
1. Right after program startup before any threads are spawned.
2. After a fork before an exec.
In both cases it can be known that no threads are running. (Ok, for 1 it can actually be non-trivial if you have code before main or if you call functions that spawn helper threads, but let's assume that you can know this).
However no languages actually have ways to enforce this. So the APIs can be called at any time and are huge footguns.
I think that the proposed improvement of `getenv_s` is great. It is cheap and easy to use, then software can slowly migrate off of the less safe stuff. You can imagine that if libc stopped using `getenv` internally most of this problem would be solved.
No, in many cases one needs to set them interactively.
Consider for instance something as simple a implementing a shell. Such a program needs to be able to set the environment based on user interaction and this change needs to show up in /proc/$pid/env.
Because the specification of the POSIX shell says that `export` changes the current environment of the running process, not just of any newly started processes.
This is useful to recognize various processes I suppose. I have written code that scans the environment of processes to find particular processes and group them together.
Programmers have to deal with a lot of badly written programs all of the time. You'd need this functionality to either debug a program that responds differently to different values of environment variables, or to control it, because, maybe it's the only reasonable way to do so.
It's OK to say that programmers shouldn't rely on this functionality ideally, but, for practical reasons, this functionality is needed. Same happens in "pure" functional languages, for example, when you need to debug programs in such languages interactively, and struggle to create the program state that reproduces the problem, or, in some extreme cases, due to I/O being "impure" even struggle to output diagnostic information.
While I am sure that thousands of hours have been spent debugging threaded setenv() attempts (and developing & discarding Annex K), it is clearly not a problem that needs a solution.
Languages that compile to C need be careful not to promise thread-safe implementations of POSIX or C functions that are explicitly documented as not reliably thread-safe, including setenv(). The author seems to want to change C, and POSIX, so that Go can reliably do so.
Right, why is the problem "changing memory in a shared memory execution model will cause corruption" and not "Why are we using such a fragile shared memory execution model in the first place."
Or: C knows that it doesn't need fixing.
How often do I need to `setenv()` anything? The answer is "Never" in the vast majority of programs, because ENVVRS are usually read rather than set, so this issue is nonexistent for them.
For the vast majority of the small amount of programs that actually need to use `setenv()`, the answer is: "Maybe once or twice during the entire lifetime of the process, and then only at the very start, probably even before running any threads", meaning this issue is nonexistent for them as well.
So, is there a potential issue with thread safetey? Yes. Does it matter given where and under what circumstances it occurs? Not really.
> such as Go's os.Setenv (Go issue)
Here is the link to the "issue":
https://github.com/golang/go/issues/63567
What kind of actual real life production code would continuously set envvars while simutaneously calling a function that tries to read the environment?
Yes, this is a footgun. But even the issues author acknowledges, in the issue thread:
> It has wasted thousands of hours of people's time, either debugging the problems, or debating what to do about it.Source?