> C Doesn't Want to Fix It Or: C knows that *it doesn't need fixing.* How often ...

jeroenhd · on Nov 20, 2023

> Or: C knows that it doesn't need fixing.

People don't like APIs that can randomly crash your program while there's no good technical reason for why they should. Why not fix the problem? People like you, who have no issues with the current implementation, won't see any regressions because you're already a good citizen, and myriad other programmers whose programs do occasionally crash because of this will be helped.

> So, is there a potential issue with thread safetey? Yes. Does it matter given where and under what circumstances it occurs? Not really.

"The unpredictable crashes only happen very rarely" doesn't mean the crashes go away.

> What kind of actual real life production code would continuously set envvars while simutaneously calling a function that tries to read the environment?

The reproduction sample calls setenv in a loop so the issue can be reproduced. A single setenv anywhere in the code is enough to trigger the crash, but then you would get one of those "you need to run the program a million times to reproduce it" bug reports that gets pushed down the line.

usrbinbash · on Nov 20, 2023

> Why not fix the problem?

Because doing so breaks backwards compatibility, simple as that.

The problem isn't even that `setenv` isn't thread save. The problem is that `getenv` returns a `*char` directly into the environment memory space. Many many many programs rely on that being the case.

> People like you

People like me would like every software to be perfect, but that's not the world we live in, so we are forced to be pragmatic. When fixing something causes more problems by breaking backwards compatibility promises, than it prevents, then there is no good argument for a fix, and the correct approach is to say "yes, this sucks, let's document it well so people don't waste too much time on this".

The setenv/getenv problem is such a case. Anyone who disagrees is free to fork glibc, implement whatever fix they think is adequate, and then try to compile the software packages found on a typical Linux server against the result.

> so the issue can be reproduced.

"Can be reproduced" and "is a common issue in production code" are not the same.

Fact is, almost all production programs that set envvars, do so once, very early in the process lifecycle, and then never again, and so are never affected by this.

mastax · on Nov 20, 2023

So why not implement the fix suggested in the article: improve the existing interface to the extent possible, and introduce a new interface which is easier to use correctly.

fch42 · on Nov 20, 2023

There is nothing to "improve" on the existing interface, really. From a C point of view ... a _hidden_ global lock is worse than no lock at all. Because in the latter case ... you, as the programmer, have a choice what to do. If you never call setenv(), no locks. If you only ever call setenv() in your startup code, no locks. If you only ever call setenv() after fork&co, no locks. And if you do believe you need to call it at runtime, but are singlethreaded ... still no locks. And if you really really really need to call it from a multithreaded process, concurrently with getenv(), then lock around both and make your getenv() "safe" wrapper create you an owned point-in-time copy - basically a getenv_r().

Note also that "global references" like getenv() returns and point-in-time owned snapshots don't behave the same way. Say, a library initializer code could retrieve a number of env var references by calling getenv(), and then use those at runtime. No more need/use for getenv() again after - and even perf-sensitive code could look at the env var. With a func that copies, the perf-sensitive code would need to do that each time (lock, lookup, copy). Not strongly desirable.

Also ... UNIX is rather flexible ... and if you so wish, you _can_ substitute _your own_ setenv()/getenv() by the magic of dynamic linking. To create a set that locks and returns you leaked copies (changes the semantics of getenv so that the caller must free the pointer to avoid a leak). It's all possible to do this.

I'm getting the impression from this that we see a "go tantrum" here. "I make my own standards but I wanna use that C/Unix standard thing as well but not how it is because it's not nice it should take go into account waaaahwaaah ...".

It is not _nice_ to modify your own env at runtime. Maybe, just maybe ... that's for reasons. Because not everything that can be done is also a great idea.

rerdavies · on Nov 20, 2023

So why not implement it yourself, instead of polluting the standard runtime with functionality that nobody needs?

zare_st · on Nov 20, 2023

> People don't like APIs that can randomly crash your program while there's no good technical reason for why they should. Why not fix the problem?

I think you're not seeing this from the right POV. People that consume POSIX API need to know POSIX API.

https://pubs.opengroup.org/onlinepubs/009604499/functions/se...

It says loud and clear "The setenv() function need not be reentrant. A function that is not required to be reentrant is not required to be thread-safe."

> "The unpredictable crashes only happen very rarely" doesn't mean the crashes go away.

If you get a crash over setenv() reading the manual page of setenv C call should be your first step. And the only step. The bigger issue is in design of application that has wrongly assumed setenv() is thread-safe. That requires a refactoring and is solely due to developer misunderstanding the API.

tikhonj · on Nov 20, 2023

"RTFM" is not a coherent defense for awful API design and we shouldn't accept it as such.

zare_st · on Nov 20, 2023

Who is "we"?

I'm a UNIX/C programmer for decades and I don't care about this.

There is no such thing as beautiful API design. Every design is a compromise. If you think non-reentrant calls should be deprecated in POSIX take it to the committee.

There is a myriad of non-reentrant code both in POSIX spec and in libc implemenations. You need to RTFM, I'm sorry.

There is no "coherent API" as far as null termination goes too. Some library functions deal with it, some calls don't. You need to RTFM.

I also want to know OP's reason to even use setenv() in a multithreaded piece of software. It's like an oxymoron. setenv and vars are useful to pass on data from parent process to forked children because they inherit the environment. If you use the threading model you don't need it. If your application is a single process setenv() is useless.

usrbinbash · on Nov 20, 2023

What should we accept? That every library is made under the assumption that it has to work as expected, even if people ignore the documentation?

As someone who made and maintains multiple libraries: No. Not gonna happen.

JohnFen · on Nov 20, 2023

Putting aside whether or not the design is awful, the fact that it's standardized and documented is absolutely a valid argument. Changing it now would break backward compatibility. That should always be a showstopper.

Programmers who are using any library code without reading and understanding the documentation are asking for trouble regardless of language.

The correct solution to your objections is to create new functions that behave as you prefer.

wredue · on Nov 20, 2023

The real skinny of it is that it’s in the name: “Environment”.

If you’re calling setenv in the middle of your program, you fucked up.

There are those things in programming that should be extremely triggering to your “what the actual fuck?!” senses, and “setenv in the middle of runtime” is one of those things.

kstrauser · on Nov 20, 2023

True, but for every envvar a program reads, something called setenv on it originally. It’s not like no programs call setenv in the middle of runtime. Examples:

- Shells

- CI runner

- Container launchers

- IDEs

DSMan195276 · on Nov 20, 2023

> but for every envvar a program reads, something called setenv on it originally

That's not true, that's just misunderstanding how it works. `execve()` takes an entirely new copy of environment variables to give to the child, that's the "real" way to do it.

tsukikage · on Nov 20, 2023

The child process's environment for these purposes is constructed without mutating its parent's environment - a copy is used - and before the child process actually runs the target code it was created to run. So there is no possibility of race between mutations to the environment and reads of the environment. If you are writing such a tool but doing something other than this, you are doing it wrong.

stefan_ · on Nov 20, 2023

No, a process gets its environment variables from the operating system (just like argc, argv) before any code is ever executed and the majority never change them.

ric2b · on Nov 20, 2023

Then why does setenv even exist? Maybe that's the issue and it should be deprecated and throw compilation warnings?

loup-vaillant · on Nov 20, 2023

You are literally putting forth arguments in favour of fixing the thread safety issue, and then conclude it’s not worth the effort.

It’s simple, really: we indeed rarely to `setenv()`. So it’s not a performance problem. So we can make it thread safe, and the performance impact will be negligible. In exchange for this small price, safety will increase.

Sacrificing any amount of safety for a negligible improvement in performance is flat out unprofessional, and should be grounds for immediate termination in most contexts.

usrbinbash · on Nov 20, 2023

> You are literally putting forth arguments in favour of fixing the thread safety issue, and then conclude it’s not worth the effort.

Yes. I do. These two concepts don't contradict each other.

> No it’s not a performance problem. So we can make it thread safe, and the performance impact will be negligible

Who said anything about performance being the problem, or a reason not to change it!?

The problem is BACKWARDS COMPATIBILITY. The issue is that `getenv` returns a `*char` into the envvar array. Basically every application that uses this function relies on this fact.

So we have:

a) A potential issue that occurs only in very unusual circumstances, most of which will never occur in production code and on the odd chance that they do, they can easily be avoided. Documenting that well can help prevent time wasted in debugging.

b) A fix that may prevent a) but breaks backwards compatibility promises, and would necessitate reworking god knows how many programs, the vast majority of which were never impacted by the issue in the first place.

Of these 2 options, a) is just the better one. Yes, in an idea world, we could have pure, 100% bug free code, and spend an unlimited amount of time on fixing every last problem. That's not the world we live in however, and so a pragmatic approach is simply a necessity.

DSMan195276 · on Nov 20, 2023

How do you propose making it thread-safe? The real problem here is that `getenv()` was designed around it returning a `char *` into some read-only memory. It's a bad API if the backing data can change because the returned pointer is assumed to exist 'forever'.

`setenv()` has no way to knowing where those pointers are floating around so there's no way to safely change the environment variables. The best you could do would be to leak memory every time you set new environment variables so that the old pointers don't get invalidated, and that just creates a new problem and reason not to use `setenv()` (that's arguably worse).

cnity · on Nov 20, 2023

Here's my proposal: Introduce a new threadsafe API (`tgetenv` or whatever) which takes _two_ `char *`s, one of which is a return buffer. This leaves allocation as a responsibility of the caller.

And then you can leave the existing syscalls as they are (thread unsafe) while having a separate thread safe version.

DSMan195276 · on Nov 20, 2023

I agree that would be the way to do it, but now we're no longer talking about simply 'fixing' the implementation of the existing API but rather introducing a new function you have to use.

`setenv()` would only be safe if your program never uses `getenv()`, and calls to `getenv()` are so numerous and all over the place that for most non-trivial programs it would be hard to ensure they never happen.

There's also the rub that `setenv()` is not part of the C standard, it's POSIX. I don't think the C standard would ever introduce `tgetenv()` to fix a problem it doesn't have, so non-POSIX code would have to continue to call `getenv()` since that's all that is available to them.

josefx · on Nov 20, 2023

> I don't think the C standard would ever introduce `tgetenv()` to fix a problem it doesn't have

The C standard has no problem acknowledging that getenv is subject to data races for most of its implementations. As far as I can tell that part was even added at the same time as threading support.

DSMan195276 · on Nov 20, 2023

Well actually I'll have to eat my words on that one - I didn't catch that Annex K in C11 includes `getenv_s` (even if it is optional).

leoh · on Nov 20, 2023

>you have to use

I mean, why not just deprecate the old one; add a warning if it’s used

DSMan195276 · on Nov 20, 2023

That doesn't really help you determine whether a given library is using `getenv()` or not. That also requires that things are actually recompiled/updated, which for some C libraries is not that often.

There's also the rub that many C programs do not target the latest standard (for a variety of reasons). I didn't realize `getenv_s` was added in C11 (though it's optional), but it doesn't really matter because programs/libraries that target C89 or C99 can't use it anyway.

leoh · on Nov 20, 2023

That’s a good point. I would guess there are ways of doing static analysis to see if a given binary is making getenv() calls though, even if one doesn’t fully grok its source.

Maybe some combo of that with sentenv() in your source or something

Or do “live” analysis under integration and give a low priority warning

But yeah it’s hairy, you’re right

salawat · on Nov 20, 2023

GNU convention is <funcname>_r for a reentrant version of a non reentrant function.

I'm in the process of working on a tool in C at the moment, so for once I actually have some context on what's being grumped about here!

JohnFen · on Nov 20, 2023

Entirely this. It works so well that I've seen this in various utility libraries for decades.

cnity · on Nov 22, 2023

Yeah. Most of my code these days works like this, to the point where almost all of my allocation happens roughly in my main entry-point. Or I use arena allocators and pass those into my utilities. But there's basically never a `realloc` or `malloc` call deep in my code anymore.

forrestthewoods · on Nov 20, 2023

> to safely change the environment variables. The best you could do would be to leak memory every time you set new environment variables so that the old pointers don't get invalidated, and that just creates a new problem and reason not to use `setenv()` (that's arguably worse).

Arguably worse? My goodness no.

This is a rare edge case that most programs don’t encounter. Option 1 is to crash and explode and die. Option 2 is to leak tens of bytes.

Leaking tens of bytes is for sure NOT worse than crashing.

none_to_remain · on Nov 20, 2023

You could even put a note in the man page that `setenv()` will leak memory. Then ten or twenty years from now there will be a blog post about how a currently trendy language/runtime can be manipulated into looping over `setenv()` zillions of times and OOM'ing, and comments about how no one can possibly be expected to read the man page for this horrible footgun, and it's wrong to expect developers to have any idea about what they're doing, give a shit, or pay attention at all.

marcosdumay · on Nov 20, 2023

> Leaking tens of bytes is for sure NOT worse than crashing.

I do really disagree here. The answer is not clear at all.

But then, you are mischaracterizing the problem. The issue is not with crashing, you can get plain bad data too, and this is clearly worse than both leaking memory and crashing.

Also, the GP is mischaracterizing the options. You don't need to leave the old values around, you can just copy them into userspace memory.

DSMan195276 · on Nov 20, 2023

My reasoning is simple - the issues here can be avoided if you're careful about how you use `setenv()` and `getenv()`, which many programs already are. The memory leak in contrast would never be avoidable regardless of how you use it.

forrestthewoods · on Nov 20, 2023

The problem with “be careful” is that libraries often want to use the very unsafe API and there is no standard mechanism to expose safety. It’s fundamentally a bad API design. It could be good. But it is not.

There’s a reason this problem comes up on HN once or two a year. And don’t even get me started about printf grabbing a mutex for a stupid locale…

leoh · on Nov 20, 2023

This is not a good argument imo. Its “rarity” still affects a tremendous number of folks in profoundly vexing ways that are difficult to debug on account of this not only affecting C but innumerable other languages’ compilers and interpreters that rely on the stock getenv implementation.

I wouldn’t be surprised if a good chunk of compilers and interpreters in other languages suffer from this gotcha’.

I mean, I wouldn’t even be surprised if some JVM implementations silently expose their users to bugs on account of this implementation.

EDIT: … ha, sure looks like it https://github.com/openjdk/jdk/blob/a2c0fa6f9ccefd3d1b088c51...

zlg_codes · on Nov 20, 2023

Ooh, the old 'unprofessional' epithet! What do you mean by that slur here? Most can't agree on what professional even means. Additionally, why should one be held to artificial, inconsistent, and poorly defined standards of 'professionalism' when they aren't a professional?

My care for code robustness scales with income.

loup-vaillant · on Nov 21, 2023

> Ooh, the old 'unprofessional' epithet! What do you mean by that slur here?

For an action I generally mean "malpractice". Something bad enough to bar repeat offenders from the profession (if we even were a profession, which we’re not). For a person I mean "unfit to program code other people rely on".

> My care for code robustness scales with income.

Good point: the conditions for us to write code that actually works are too rarely met. The only answer I have for this one is political though, not technical.

MrBuddyCasino · on Nov 20, 2023

Unpopular opinion: Neither Go's os.Setenv nor Rust's std::env::set_var() should exist. I was pleased to find that Java only has System.getEnv(), but not a setter.

jeroenhd · on Nov 20, 2023

I think there are good reasons for Setenv and set_var to exist, but if they are implemented, they shouldn't be wrappers around POSIX' shitty API and implement their own environment variable system instead (one of which the initial variables are possibly initialised by a call to getenv to make them compatible).

There's no reason why these languages need to restrict themselves the same way C does.

OskarS · on Nov 20, 2023

That doesn't fix the problem: these languages has to be able to coexist peacefully with C in the same address space. You can have a dynamically linked library written in Rust in a host program written in C, you can use C libraries in Go, etc.

Even if that wasn't an issue: this is a bug in C as well! You should absolutely be able to use setenv/getenv safely in multi-threaded C, it's insanity that you can't.

MrBuddyCasino · on Nov 20, 2023

The bug in Golang was because DNS lookups interact with the C library, which looks up environment variables. As long as everything happens in Goland, there is no problem - but this is simply not good enough.

jeroenhd · on Nov 20, 2023

Go makes the assumption that the DNS lookups are thread-safe, but it doesn't have that guarantee (or the C library is spec-incompliant, but I doubt that). It's still something Go can fix.

You can't fix C libraries loaded into Go programs (i.e. and external library calling C's setenv, or I suppose explicit FFI calls by the user), but Go can be responsible for the APIs it calls itself. That may necessitate writing a thread-safe alternative for DNS lookups, or documenting and/or adding compile time warnings that threaded programs doing DNS lookups will just crash sometimes, but the language's standard library can still make it much harder for developers to write buggy code.

MrBuddyCasino · on Nov 20, 2023

My impression is that this was Golang's plan from the start - this is why they didn't want to use the C stdlib at all, issuing the Kernel syscalls directly from the Golang runtime. A good idea, but then they had to backpedal to solve issues such as DNS resolution respecting certain OS settings, and this bug is a symptom of that.

fch42 · on Nov 20, 2023

Yes, there are certain things in UNIX which _are_ part of the standard (POSIX / IEEE1003) but _aren't_ usually implemented as system calls.

Name lookups (whether user identities or network resources) are the biggest chunk of these. You have a "choice" as a user/programmer here. Say, the existing name lookup interfaces in most libc implementations don't do DNS-over-HTTP (DoH); you can implement that yourself and just use the addresses returned by your library/package where the system calls ... want addresses.

If you have the go stance, go all the way. Don't say "the C runtime is sh*te but I really really really want that one particular teensy tiny bit of it could someone somewhere somehow please do something to make it a little less sh*te". Legacy baggage is a burden and backwards compatibility shackles you. The C/Unix interfaces are full of this, and with the hindsight of 50 years noone today, not even "C programmers", would implement them all the same way again. But that doesn't mean their behaviour can be arbitrarily changed.

o11c · on Nov 20, 2023

> Go makes the assumption that the DNS lookups are thread-safe

DNS functions are thread-safe.

The thing people aren't understanding here is when you set loose nasal demons (such as by calling `setenv` in a multithreaded program), they can cause problems even in safe code.

dwattttt · on Nov 20, 2023

If a function is safe only if everyone else you rely on never calls a particular function, it's not that safe. Certainly less safe than other functions guaranteed not to result in crashes if you use them right.

o11c · on Nov 21, 2023

You can make any safe function fail trivially with appropriate uses of unsafe functions.

For example, imagine the chaos of `memset(stderr&-4096, 0x42, 4096)`.

usrbinbash · on Nov 20, 2023

That is an unpopular opinion for the simple reason that some programs do in fact need to set envvars, particularly programs that will start child processes.

pajko · on Nov 20, 2023

Nope, execve() and friends ending in 'e' accept a pointer to a completely new set of environment variables, no need to do setenv. Windows has _execve() too.

usrbinbash · on Nov 20, 2023

The fact that there is an alternative doesn't change the fact that a lot of software relies on the worse method to work.

naniwaduni · on Nov 20, 2023

It makes it a pretty silly idea to invite people writing new programs in a new language to use that method, though.

rerdavies · on Nov 20, 2023

A lot of software doesn't modify the environment when exec-ing.

usrbinbash · on Nov 21, 2023

So? A lot of software doesn't use file compression, should we remove these libraries?

MrBuddyCasino · on Nov 20, 2023

That is still possible using the java.lang.ProcessBuilder API: you can launch a child process and give it a modified environment, but just at launch time. This side-steps the issue.

ryan-c · on Nov 20, 2023

Programs that need to set environment variables for child processes should use `execvpe` or `execle`.

account42 · on Nov 20, 2023

Or posix_spawn / posix_spawnp.

badsectoracula · on Nov 20, 2023

Or programs that rely on libraries that for some unfathomable reason expose some functionality only via environment variables without an API.

looks at SDL

account42 · on Nov 20, 2023

There is SDL_SetHint [0] which doesn't modify the environment but instead changes the value internally to SDL only.

[0] https://wiki.libsdl.org/SDL2/SDL_SetHint

badsectoracula · on Dec 1, 2023

That is SDL2, what i had in mind is SDL1 - but also the SDL1-on-SDL2 implementation which had some SDL2-specific extras (for things like scaling).

kevincox · on Nov 20, 2023

I think the probably is really that there are 2 times where you should be setting env vars 99% of the time.

1. Right after program startup before any threads are spawned.

2. After a fork before an exec.

In both cases it can be known that no threads are running. (Ok, for 1 it can actually be non-trivial if you have code before main or if you call functions that spawn helper threads, but let's assume that you can know this).

However no languages actually have ways to enforce this. So the APIs can be called at any time and are huge footguns.

I think that the proposed improvement of `getenv_s` is great. It is cheap and easy to use, then software can slowly migrate off of the less safe stuff. You can imagine that if libc stopped using `getenv` internally most of this problem would be solved.

Blikkentrekker · on Nov 20, 2023

No, in many cases one needs to set them interactively.

Consider for instance something as simple a implementing a shell. Such a program needs to be able to set the environment based on user interaction and this change needs to show up in /proc/$pid/env.

__david__ · on Nov 20, 2023

Why does a shell need its current environment to be visible in /proc/$pid/env (as opposed to just its initial environment)?

Blikkentrekker · on Nov 30, 2023

Because the specification of the POSIX shell says that `export` changes the current environment of the running process, not just of any newly started processes.

This is useful to recognize various processes I suppose. I have written code that scans the environment of processes to find particular processes and group them together.

rerdavies · on Nov 20, 2023

If you need to set environment variables for child processes in a thread-safe manner, use execvpe or execle.

crabbone · on Nov 21, 2023

This is a bizarre take...

Programmers have to deal with a lot of badly written programs all of the time. You'd need this functionality to either debug a program that responds differently to different values of environment variables, or to control it, because, maybe it's the only reasonable way to do so.

It's OK to say that programmers shouldn't rely on this functionality ideally, but, for practical reasons, this functionality is needed. Same happens in "pure" functional languages, for example, when you need to debug programs in such languages interactively, and struggle to create the program state that reproduces the problem, or, in some extreme cases, due to I/O being "impure" even struggle to output diagnostic information.

grodriguez100 · on Nov 20, 2023

I fully agree with your unpopular opinion.

xbar · on Nov 20, 2023

While I am sure that thousands of hours have been spent debugging threaded setenv() attempts (and developing & discarding Annex K), it is clearly not a problem that needs a solution.

Languages that compile to C need be careful not to promise thread-safe implementations of POSIX or C functions that are explicitly documented as not reliably thread-safe, including setenv(). The author seems to want to change C, and POSIX, so that Go can reliably do so.

somat · on Nov 20, 2023

Right, why is the problem "changing memory in a shared memory execution model will cause corruption" and not "Why are we using such a fragile shared memory execution model in the first place."