> The example code is very simplistic, so of course that linear code is more read...

saurik · on Sept 15, 2023

I don't know if it is still like this, but the code for dpkg used to be like this, and it was amazing: if you ever needed to know exactly what order the various side effects of installing a package happened in, you could just scroll through the one function and it was obvious.

To this end, I'd say it is important to be working in a language that avoids messing up the logic with boilerplate, or building some kind of mechanism (as dpkg did) to ease error handling and shove it out of the main flow; this is where the happy path shines: when it reads like a specification.
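
A minimal sketch of that idea, in Python rather than dpkg's C and not dpkg's actual mechanism: undo actions are registered as each step succeeds, so the main flow stays a plain top-to-bottom list of steps. The function name `install_package`, the step names, and the `"broken"` failure trigger are all made up for illustration.

```python
from contextlib import ExitStack


def install_package(name, log):
    """Hypothetical installer: the happy path reads top to bottom."""
    with ExitStack() as undo:
        log.append(f"unpack {name}")
        # Registered now, run only if a later step raises.
        undo.callback(log.append, f"remove unpacked files of {name}")

        log.append(f"configure {name}")
        undo.callback(log.append, f"restore old config of {name}")

        if name == "broken":
            # Stand-in for a failing maintainer script.
            raise RuntimeError("postinst failed")

        log.append(f"register {name}")
        # Success: move the undo actions to a stack that is never closed,
        # so none of them run.
        undo.pop_all()
    return log
```

On failure the registered undo actions run in reverse order and the exception still propagates, so none of the error handling has to appear between the steps themselves.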

realrains · on Sept 15, 2023

I don't think the fact that a function works well is a good enough reason to write a 2000 line function. Sometimes there are long pieces of code that implement complex algorithms that are difficult to break into smaller pieces of code, but those cases are limited to the few you mentioned.

BigJono · on Sept 15, 2023

Computers execute code in a linear fashion, why on earth would you "need a reason" to NOT abstract something? Just because abstraction is often the right thing to do doesn't make it the base case.

It's like saying you need a reason not to add 4000 random jumps in your assembly code just to make it more difficult to read...

ahtihn · on Sept 15, 2023

Source code isn't written to be executed by computers, it's written to be read by other humans.

Source code tends to be very far removed from how computers execute anything, so I wouldn't use that as a justification for any sort of code style.

amoss · on Sept 15, 2023

> Source code isn't written to be executed by computers, it's written to be read by other humans.

It is pronounced "documentation".

nomel · on Sept 15, 2023

> that implement complex algorithms that are difficult to break into smaller pieces of code

My longest code is always image processing. It's usually too hard to break up for the sake of breaking up. There's nothing to reuse between the calls to filters/whatever.

flohofwoe · on Sept 15, 2023

The default should be reversed, don't break into smaller pieces unless there's a really good reason.

coldtea · on Sept 15, 2023

>I don't think the fact that a function works well is a good enough reason to write a 2000 line function.

The fact that it works well and reads well (when it does, as in the parent's case), is.

Aside from those factors what else would be against it? Dogma?

osigurdson · on Sept 15, 2023

I guess all we know is there were 2K lines of code and the commenter thinks that was the right way to do it. It would be necessary to see the code to appropriately critique it.

goatlover · on Sept 15, 2023

Not just the commenter, but his team as well. It passed code review with flying colors, apparently. The moral of the story is that there are always exceptions and developers should not be ideologically committed to one approach above all else.

em-bee · on Sept 15, 2023

we know more than that: You could argue that every one of those 9 scopes could be a separate function, but then devs would be tempted to reuse them. Yet, each step had subtle assumptions about what happened before.

what we don't know is if it would have been possible to abstract those assumptions away so that functions could have been defined without them.

PH95VuimJjqBqy · on Sept 16, 2023

We do know that if we trust the poster, they said very clearly it could have been done but they didn't consider the value to outweigh the downsides.

em-bee · on Sept 17, 2023

yes, i meant we don't know if it would have been possible to extract functions in such a way that they are actually safely reusable.

osigurdson · on Sept 15, 2023

Even the contrived example in the post can be factored differently (and better imo). How do we know those 9 scopes are appropriate?

RHSeeger · on Sept 15, 2023

>The moment we would have spent effort to make them distinct functions we would have had to recheck our assumptions, generalize, verify that methods work on their own

Why? Why can't the functions say "to be used by <this other function>, makes assumptions based on that function, do not use externally"? Breaking out code into a function so that the place it came from is easier to maintain... does not mandate that the code broken out needs to be "general purpose".
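
One way that could look, as a rough Python sketch with invented names (`process_order`, `_validate`, `_price`): the helpers are defined inside the function that owns them, so they cannot be called from anywhere else, and each one states the assumption it inherits from the step before it.

```python
def process_order(items):
    """Hypothetical long process split into helpers nobody else can call."""

    def _validate(raw):
        # Assumes it runs first: nothing upstream has cleaned the input.
        return [i for i in raw if i["qty"] > 0]

    def _price(valid):
        # Assumes _validate already ran: every qty is positive.
        return sum(i["qty"] * i["unit_price"] for i in valid)

    valid = _validate(items)
    return _price(valid)
```

Nothing here is general purpose. The helpers exist only to give the steps names, and the order is still fixed in one place.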

laserbeam · on Sept 15, 2023

Specifically, in that place, there was no need. And prematurely splitting it would have caused us to overthink and over generalize. Having a long, linear and tested function was a better choice.

auggierose · on Sept 15, 2023

I understand your point, but perhaps that would have simply been an opportunity to refine your approach to code design. If such a situation leads to excessive deliberation and overgeneralisation, your code base must be riddled with unnecessary overthinking and overgeneralisation.

goatlover · on Sept 15, 2023

Or maybe it was just a long, sequential algorithm where breaking it up wouldn't have been an improvement.

auggierose · on Sept 15, 2023

I have been programming for more than 30 years. Except for code generated explicitly to be only consumed by machine, I've never come across a function consisting of 2000 lines of code that should not have been broken up. Something is wrong there, and if you show me the code, I'll tell you what's wrong with it.

wiseowise · on Sept 15, 2023

Glad you can see that without even looking at the code.

auggierose · on Sept 15, 2023

Some things you don't have to see to know what's going on. Function with 2000 lines of code? Have fun rationalising this.

waynesonfire · on Sept 15, 2023

I worked with an engineer that wrote the most clear and elegant linear code. It was remarkable, never seen anything like it since. I can't reproduce it but I do have an idea of what a well designed linear function looks like: a story.

eep_social · on Sept 15, 2023

I was just thinking that if I _needed_ to refactor this I might structure the stages as chapters in a book. One might be able to write an inner class or some such that had a “table of contents” function that called each stage in sequence as a void function with data managed out of line, maybe via cleverly designed singleton structs. Then the code itself can be written in order with minimal boilerplate between stage boundaries.

I think I’ve worked with some Python that looked and worked this way. I can’t place the details but probably in a processor pipeline running over a particularly hairy data format. Consider ancient specifications written by engineers talking on the phone encapsulated in relatively “modern” but still vintage specifications, sometimes involving screen-scraping a green screen mainframe terminal, wrapped in XML and sent over the internet. Anyway, point is I couldn’t agree more about stories.
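
A rough sketch of the "table of contents" shape, assuming a toy three-stage pipeline. The class names, stage names, and the comma-separated input format are invented for illustration and are not the code remembered above.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical shared state, kept out of line from the stages."""
    raw: str
    fields: list = field(default_factory=list)
    total: int = 0


class Pipeline:
    def __init__(self, raw):
        self.rec = Record(raw)

    def run(self):
        # Table of contents: the only place stage order is decided.
        self.chapter_1_split()
        self.chapter_2_parse()
        self.chapter_3_sum()
        return self.rec

    def chapter_1_split(self):
        self.rec.fields = self.rec.raw.split(",")

    def chapter_2_parse(self):
        # Relies on chapter 1 having filled rec.fields with strings.
        self.rec.fields = [int(f) for f in self.rec.fields]

    def chapter_3_sum(self):
        # Relies on chapter 2 having converted the fields to ints.
        self.rec.total = sum(self.rec.fields)
```

The stages take no arguments and return nothing, so `run` reads like a chapter list, at the cost of the ordering assumptions living in comments rather than in signatures.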

laserbeam · on Sept 15, 2023

I will agree that it takes some skill, not that I am great at it. It's a different kind of skill than abstraction. Reading error handling in C code offered good insights for me to learn linearity better (C code that uses goto to jump to the end of a function for cleanup when an error occurs, for example).

However, if you screw up linear code, you screw up locally. If you write poor small functions, the rest of the team screws up because they barely ever read the contents of your functions that call other functions that call other functions. I've had way more problems with stuff being called slightly out of order, than with large functions.

yxhuvud · on Sept 15, 2023

That is true of well designed nonlinear code as well. The code needs to tell a story or it will be a mess.

osigurdson · on Sept 15, 2023

You don't have to write tests to prove that private methods work on their own. Just test the public behaviour.

koonsolo · on Sept 15, 2023

At first I thought how horrible, but basically you have sort of 9 functions within the same scope, each having a docstring. So I guess not too different from splitting them up.

I read you have "end to end" tests.

One question though: Wouldn't each part benefit from having their own unit tests?

laserbeam · on Sept 15, 2023

Maybe, maybe not. For our particular case it would have been mostly wasted effort.

I found that I like to write tests at the level of abstraction I want to keep an implementation stable. I'd be totally fine if someone went in and changed the implementation details of that long process if needed. We cared that stuff got cleaned up at the end of the process, that the output matched certain criteria, that certain user interaction was triggered and so on... In that case it made more sense to test all our expectations for a larger scope of code, rather than "fix" the implementation details.

Tests usually "fix" expectations so they don't change from build to build. Tests don't ensure correctness, they ensure stuff doesn't alter unexpectedly.

PeterisP · on Sept 15, 2023

Tests effectively freeze requirements; you should test those things which should be preserved throughout any changes, and not test those things which should be open to change. In this case, it seems there are no real requirements for any of these 9 steps - perhaps the implementer could figure out how to do the same outcome by skipping a step or merging two steps, and the existence of unit tests for these 9 functions somehow encodes the idea that these 9 functions each are inherently needed, which is not necessarily true.
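
A toy illustration of testing the outcome rather than the steps. The functions `normalize_v1`, `normalize_v2`, and `check_contract` are made up for this sketch: the check freezes only the observable result, so an implementation that merges steps still passes.

```python
def normalize_v1(text):
    # Three internal steps, none of them tested on its own.
    stripped = text.strip()
    lowered = stripped.lower()
    return " ".join(lowered.split())


def normalize_v2(text):
    # Same outcome with steps merged; split() already drops outer whitespace.
    return " ".join(text.lower().split())


def check_contract(normalize):
    # The frozen requirement: trimmed, lower-case, single-spaced output.
    assert normalize("  Hello   World ") == "hello world"
    assert normalize("") == ""
```

Had there been a unit test pinning the separate strip step, the second version would have broken it without breaking any real requirement.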

coldtea · on Sept 15, 2023

>One question though: Wouldn't each part benefit from having their own unit tests?

Not necessarily better, especially since this allows for the case where individual unit tests pass fine, but the combined logic fails.
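
A small made-up example of that failure mode, where each piece is correct against its own description and the bug lives in the seam between them. All names and the seconds/milliseconds mismatch are hypothetical.

```python
def read_timeout(config):
    # Returns the timeout in seconds.
    return int(config["timeout"])


def build_deadline(now_ms, timeout_ms):
    # Expects the timeout in milliseconds.
    return now_ms + timeout_ms


def deadline_from_config(now_ms, config):
    # Bug: seconds are passed where milliseconds are expected.
    return build_deadline(now_ms, read_timeout(config))
```

Unit tests for `read_timeout` and `build_deadline` pass on their own terms; only a test of `deadline_from_config` against the intended 30 second deadline would notice the wrong unit.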

BenFrantzDale · on Sept 15, 2023

If the sub-functions could be reused and people would be tempted to change them, then that’s what your tests are for. In fact, it’s often tricky to test the sub-function logic without pulling them out because to write the test you have to figure out how to trick the outer function to get into certain states. Follow the Beyoncé rule: if you like it, put a test on it. Otherwise it’s on you if someone breaks it.

emodendroket · on Sept 15, 2023

> You could argue that every one of those 9 scopes could be a separate function, but then devs would be tempted to reuse them.

Good thinking. Now they’ll just add 50 flags and ten levels of nested ifs instead which is much simpler.

patrulek · on Sept 15, 2023

2000 lines is like a small project. I can't imagine putting that all in one function.

reactordev · on Sept 15, 2023

>“but then devs would be tempted to reuse them”

Isn’t that the fucking point? Having a 2000 line function is a code smell so bad, I don’t care how well the function works. It’s an automatic review fail in my book. Abstractions, closures, scope, and most importantly - docs to make sure others use your functions the way you intended them. Jesus.

laserbeam · on Sept 15, 2023

Some devs did find it a code smell... But each scope had a clear short high level comment describing what it did, there were end to end tests for the method, and very little state flowed from scope to scope (some did) - because that's what scopes do... Prevent variables from leaking.

My point is the code smell isn't always accurate. Even for this 2000 line monster, other devs agreed that it was the best way to hide complexity away from the rest of the codebase in that case. If we ever needed to factor things out (we never did), we could spend some effort and do it.

okaleniuk · on Sept 15, 2023

Have you tried reading code instead of smelling it?

MrPatan · on Sept 15, 2023

A code smell means you should look into it, not that it's wrong.

Some things are genuinely 2kloc-complex. Maybe not that many. Do check! But some are.

laserbeam · on Sept 15, 2023

Definitely not that many. Even for me this was an outlier, but it made me more comfortable with functions most people would consider long.

I'd like to clarify this was not necessarily 2kloc-complex, this was just 2kloc-long-and-not-really-meant-to-be-reused. It was a fairly long but linear process that was out of the ordinary for the rest of the codebase. It could easily have been split (hell, I had 9 fairly separate stages), but calling any of the intermediate stages out of order or without the context of the rest of the execution flow... would have been a foot gun for someone else. And, as time showed, we never needed those stages for anything else.

turdprincess · on Sept 15, 2023

Agreed. I’ve written plenty of software of all kinds and have never had to write a 2000 line long method (although I have had the joy of refactoring such messes a time or two).

Just don’t do that. Your code doesn’t have to have abstractions out the wazoo, but if your class (or method) is getting bigger than 1000 lines that’s a great sign that it’s doing too much and abstractions can be teased out. Your future self will thank you, as well as your team.

crabmusket · on Sept 15, 2023

I like this from Sandi Metz:

> You can't create the right abstraction until you fully understand the code, but the existence of the wrong abstraction may prevent you from ever doing so. This suggests that you should not reach for abstractions, but instead, you should resist them until they absolutely insist upon being created.

turdprincess · on Sept 15, 2023

At least in the mobile world, I find that this “no abstraction” approach is the default one, and it usually leads to huge objects which do everything from drawing views to making network requests to interacting with the file system. These kinds of classes are quite hard to work in, hard to test, and also keep snowballing to get bigger and bigger. Things usually end with unmaintainable code and a full rewrite.

I am not saying you need to create complex abstract hierarchies right off the bat. But usually, it’s pretty easy to tease out a couple significant abstractions that are very obvious, and break down your classes by a factor of two or three. Just getting such low hanging fruit will prevent you from ever having a 2000 line long method.

And for the folks who are saying that they make sure to not add abstractions too early - are you disciplined enough to go back and add them later? I feel like if you’re the kind of engineer that busts out 2000 line methods, you’re also not going to refactor it as this method grows to 2500 or 3000 lines or beyond.

Probably most robust software you depend on is full of solid, quality abstractions. Learning to write code like this takes practice. The wrong abstraction might be wrong, but it’s one step closer on your journey to growing as an engineer. You won’t grow if you never try.