That was my second thought as well. But if you look at the selector

    $find("._timestamp[data-time$=" + E + "000]")
Surely running this regex-find on DOM elements is not much of a performance savings?

It also has the weird side-effect that some tweets can look older than the tweets below them, making the ordering look incorrect. Meh, not really a big deal, but I was glad to have finally figured out what was really going on there.



While I don't have any information on the selector performance, here is a snippet of an email @bcherry, the author of the self-updating timestamps, sent on some performance findings:

"Unfortunately, as you'd guess, [updating timestamps] is slow, and degrades as more tweets get added to the page. With the last 12 hours of a [test account] timeline up (1,150 tweets), it would take 115ms to regenerate all the timestamps. I'd already built in optimizations to mark >24hoursago timestamps as not needing to be checked again, and to not change the text if the text didn't change, but it was still slow. Anything more than 50ms is too slow for a recurring process.

...

The unix timestamps appear to be an even distribution in the last digit, so I just do a query on an even slice of them based on the last digit (i.e. $("._timestamp[data-time$=0]")), and update those, changing digits every 2s. This means it takes a 20s cycle to update them all.

On the same [test account] home timeline with 1,150 tweets, each batch of ~110 timestamps took ~30ms, which is totally reasonable."
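
For anyone who wants to see the shape of that scheme, here is a minimal sketch, not @bcherry's actual code. It assumes data-time holds a Unix time in milliseconds (which would explain why the real selector appends "000"), and it uses a plain array filter as a stand-in for the DOM query so it runs outside a browser. The names batchSelector, selectBatch, and makeBatchCycler are made up for illustration.

```javascript
// Hypothetical sketch of the "one last-digit slice per tick" idea.

// Build the attribute selector for one batch. Assumes data-time is in
// milliseconds, so the last digit of the seconds is followed by "000".
function batchSelector(secondsDigit) {
  return '._timestamp[data-time$="' + secondsDigit + '000"]';
}

// Stand-in for the DOM query: keep items whose data-time ends with the suffix.
function selectBatch(items, secondsDigit) {
  const suffix = secondsDigit + "000";
  return items.filter((item) => String(item.dataTime).endsWith(suffix));
}

// Returns a tick function that updates one slice per call, cycling 0..9.
function makeBatchCycler(items, update) {
  let digit = 0;
  return function tick() {
    const batch = selectBatch(items, digit);
    batch.forEach(update);
    digit = (digit + 1) % 10; // ten ticks cover every timestamp once
    return batch.length;
  };
}
```

In a page you would presumably pass batchSelector(digit) to $() instead of filtering an array, and drive tick with setInterval(tick, 2000) to get the 20s cycle described in the email.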


Interesting. So the total cumulative time to update the timestamps like this is 300ms (10 * 30ms), whereas the time to update them all in one shot is 115ms. This way it blocks for 30ms every 2 seconds (or 300ms spread over 20 seconds) instead of 115ms every 20 seconds. I'm not sure which is better, but it does sort of confirm that doing the extra regex matching introduces some amount of extra overhead.
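
Spelling out that arithmetic with the figures quoted from the email (the variable names are mine, and this only restates the numbers, it is not a benchmark):

```javascript
// Figures taken from the quoted email; names are illustrative only.
const batchesPerCycle = 10;   // one batch per last digit, 0..9
const perBatchMs = 30;        // ~30ms per batch of ~110 timestamps
const allAtOnceMs = 115;      // regenerate all 1,150 timestamps in one pass

const batchedTotalMs = batchesPerCycle * perBatchMs;  // 300
const extraMs = batchedTotalMs - allAtOnceMs;         // 185

// Longest single pause the page sees under each approach.
const longestPauseBatchedMs = perBatchMs;             // 30
const longestPauseAllAtOnceMs = allAtOnceMs;          // 115
```

So batching costs about 185ms more work per 20s cycle, in exchange for a worst-case pause of 30ms instead of 115ms.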


Good point. I'm very curious now about the selector overhead. We do a number of operations across varying intervals. Surely one of the goals behind distributing the timestamp updates is to minimize the time we block this pipeline. That being said, it would be interesting to benchmark a variety of update interval and batch size combinations.


I've got extensive notes from last summer of more things to try. I totally believe we can get them counting second-by-second.


It's not a "regex find" as the string is not a regular expression. The string is a CSS selector. jQuery can find DOM elements by CSS selector very quickly by deferring to document.querySelectorAll when available.

If I were facing this performance issue, I'd check to see if there was a fast-performing way to restrict the query to just DOM elements in the viewport. I would guess that check would take longer than the date-updating code, though.
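
A rough sketch of what such a viewport check might look like, assuming each element's box comes from getBoundingClientRect() and the viewport height from window.innerHeight. The helper name intersectsViewport is hypothetical, and the geometry is kept as a pure function so it can run without a DOM.

```javascript
// Hypothetical helper: true if a box overlaps the visible viewport vertically.
// rect is expected to have numeric top and bottom fields, measured from the
// top of the viewport (the shape getBoundingClientRect() returns).
function intersectsViewport(rect, viewportHeight) {
  return rect.bottom > 0 && rect.top < viewportHeight;
}

// In a browser this might be used roughly as:
//   elements.filter((el) =>
//     intersectsViewport(el.getBoundingClientRect(), window.innerHeight));
```

Measuring every element this way may force layout work, which is the kind of cost that could outweigh the date-updating code itself.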


"$=" is a regex-find operator to find the ._timestamp elements with data-time property that ends in E + "000"


"$" is the same operator in a regular expression to denote that the match must be at the end of the string, and is obviously where some of the attribute selector syntax came from, but this has nothing to do with regular expressions and is likely not implemented using them: http://www.w3.org/TR/css3-selectors/

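To make the distinction concrete, here is a small sketch of what [att$=val] means per the selectors spec: a literal "ends with" test on the attribute value, where an empty val matches nothing. The function name attrEndsWith is made up, and this models the semantics only, not how any browser implements it.

```javascript
// Models the semantics of the [att$=val] attribute selector.
function attrEndsWith(attrValue, val) {
  // Per the spec, an empty val does not match anything.
  return val !== "" && attrValue.endsWith(val);
}

const dataTime = "1300000005000";

// Literal suffix: matches.
const literalMatch = attrEndsWith(dataTime, "5000");

// "." is just a character here, not a wildcard, so this does not match...
const dotAsLiteral = attrEndsWith(dataTime, ".000");

// ...whereas an actual regular expression treats "." as "any character".
const dotAsRegex = /.000$/.test(dataTime);
```

Here literalMatch is true, dotAsLiteral is false, and dotAsRegex is true, which is the practical difference between a suffix selector and a regex.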

well, ok fine. it's a regex-like operator, which still has to do full suffix matching, which isn't cheap, performance-wise... that's the point.


The DOM performance hit is probably smaller than the network hit (both on the client and at the server side) for doing all at once (if I'm understanding it correctly).


It's all done client-side, so in this case there is no network hit for doing the timestamp updates anyway.


Guess I wasn't understanding it correctly then :)


my guess is the first thing the selector does is query by class, and then filters from that set. likely not as intensive as you think.


yes, I'm guessing it filters by class first as well. but it would be interesting to find out whether just updating all the timestamps would be as fast as (or faster than) doing the secondary regex filter. i'm not sure, i haven't run any benchmarks on it.


Or, another alternative, add a secondary class so that the overall markup is class="_timestamp _timestamp_group0", etc. Perhaps they tested this and found it to be a negligible improvement, but then you're only selecting on the one class, rather than selecting on class and filtering.
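
A sketch of how that group class could be derived when the markup is generated, assuming data-time is a Unix time in milliseconds and the group is the last digit of the seconds. The function names and the exact class naming are hypothetical, following the class="_timestamp _timestamp_group0" pattern above.

```javascript
// Hypothetical: derive the group class from a millisecond timestamp.
function groupClassFor(dataTimeMs) {
  const lastSecondsDigit = Math.floor(dataTimeMs / 1000) % 10;
  return "_timestamp_group" + lastSecondsDigit;
}

// Full class attribute value for a timestamp element.
function timestampClassAttr(dataTimeMs) {
  return "_timestamp " + groupClassFor(dataTimeMs);
}

// Class-only selector for one batch, with no attribute filter needed.
function groupSelector(digit) {
  return "._timestamp_group" + digit;
}
```

The trade-off is a bit more markup per tweet in exchange for a selector that only has to match on a single class.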


http://jsperf.com/jquery-attribute-selector-performance

Looks like the class only helps slightly. It's definitely doing some optimization though.



