|
| lucb1e wrote:
| Check your power-on hours:
|
|     $ sudo smartctl -a /dev/sda | grep -e Power_On_Hours -e ^ID
|     ID# ATTRIBUTE_NAME FLAG   VALUE WORST THRESH TYPE    UPDATED WHEN_FAILED RAW_VALUE
|       9 Power_On_Hours 0x0032 098   098   000    Old_age Always  -           9743
|
| Just looking at the raw value, it seems to be 9'743 hours in my
| case
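
For anyone who wants to script the check rather than read the table by eye, here is a minimal sketch. It assumes the ATA attribute-table layout quoted above (ten whitespace-separated columns, raw value last); the helper name `power_on_hours` is hypothetical, and the 40,000-hour figure is simply the threshold discussed in this thread.

```python
import re
from typing import Optional

FAILURE_HOURS = 40_000  # threshold discussed in this thread, not read from the drive

def power_on_hours(smartctl_output: str) -> Optional[int]:
    """Return the raw Power_On_Hours value from `smartctl -a` text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows are: ID, name, flag, value, worst, thresh, type,
        # updated, when_failed, raw_value.
        if len(fields) >= 10 and fields[1] == "Power_On_Hours":
            # Some drives print extra text after the number, so keep the leading digits.
            match = re.match(r"\d+", fields[9])
            return int(match.group()) if match else None
    return None

sample = "  9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 9743"
hours = power_on_hours(sample)
print(f"{hours} hours on, {FAILURE_HOURS - hours} hours below the 40,000-hour mark")
```

Drives that report power-on time in another format (NVMe devices, for example) would need a different parser.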
| borplk wrote:
| Mine is above 53,000 hours ... time to check my backups!
| lucb1e wrote:
| Sounds like you're in the clear for this particular bug...
|
| ...but always check your backups regularly for data that is
| dear to you!
|
| Protip of the day: that includes things on someone else's
| server. I remember when Grooveshark went offline from one day
| to the next and I lost nearly my whole library because I
| remembered only some artists and had to go through thousands
| of songs to find which ones I actually liked from them. My
| browser's localStorage object contained a few playlists but I
| didn't use those much. Or when 000webhost cancelled my
| account because I was using the 100MB(?) to back up some
| files that were most important to me, rather than for actual
| webhosting (in my defense, I was 15 at the time), and so when
| I returned from a holiday with my parents with an actual
| crashed hard drive, that turned double sour. Backing up
| things from what they now call the "cloud" is something I
| learned early, as I have virtually no code I wrote before
| that summer, only some of the music, only essays with WordArt
| if they were printed, etc.
| dredmorbius wrote:
| Possibly related to recent HN issues, see:
| https://news.ycombinator.com/item?id=32031243
| solardev wrote:
| Wow, thanks for sharing. I didn't realize how closely related
| they were.
|
| (TLDR For anyone wondering, "recent HN issues" means HN very
| likely went down yesterday because of this same bug, when two
| (edit: two pairs, four total) enterprise SSDs with old firmware
| died after 40,000 hours close together. An admin of HN and its
| host both like this theory. See details in that thread.)
|
| Edit: If you want to discuss that theory, it's probably better
| to do it in that other thread directly instead... dang and a
| person from M5 Hosting (HN's previous host) are both
| participating there.
| mkl wrote:
| Not two SSDs, _four_: two in the main server, and two in the
| backup server.
| solardev wrote:
| Thanks for the correction!
| [deleted]
| kazen44 wrote:
| the chance of two SSDs failing at the same time under normal
| circumstances is extremely slim. So this might actually be a
| good cause of this incident.
| MBCook wrote:
| Especially since one pair was a nearly unused backup server
| that had a totally different use profile.
| DoneWithAllThat wrote:
| What the hell is that ridiculous "bias-free language" claptrap at
| the beginning? Man DEI is seriously out of control.
| UkrainianJew wrote:
| It's a bug in the human firmware. People seem to have an
| instinctive and subconscious need for property - out of all
| things occupying our attention span, being able to arbitrarily
| change some on a whim.
|
| I think, this instinct is responsible for humans figuring out
| farming (as in developing the land near you to your liking) and
| many cultural achievements.
|
| Except, with the information society, our attention is being
| constantly overwhelmed by the stream of information produced by
| other people, so this instinct kicks in and makes some people
| want to control what language others use. I don't think we will
| see any studies of this soon, but my hunch is that there is
| reverse correlation between the amount of one's physical
| property and one's sensitivity to the language and content of
| others' speech.
|
| Corporations happily abused it, since letting your employees
| "own" pronouns and acknowledgements is cheaper than paying them
| enough to own their houses (let alone start competing
| companies). Now it has spun into a de-facto religion where many
| people's weight in the society depends on perpetuating (and
| intensifying) the dogmas. Kinda similar to late USSR where most
| people didn't believe in communism anymore, but not having a
| Lenin's room in your office would get you labeled as an
| American spy.
|
| From what we can learn from the history, it will intensify
| until the movement splinters into competing factions, that will
| heavily oppose each other, and will eventually settle on some
| common ground to avoid continuous mutual damage.
| ParetoOptimal wrote:
| It's a device to identify certain kinds of people who would
| have a problem with less loaded language without any loss in
| clarity.
| rajamaka wrote:
| I would love to see some examples of Cisco documentation that
| ever offended anyone.
| mlyle wrote:
| I miss old Cisco documentation, with IP addresses and
| router names like SanJose3 and 408 phone numbers on PRIs
| etc.
| bombcar wrote:
| It's a warning that the documentation may refer to
| master/slave or something like that because Cisco cares
| enough about DEI to update documentation but not enough to
| actually update out-of-support firmware.
| 13of40 wrote:
| OK, here's one they need to update:
|
| https://blogs.cisco.com/news/digital-transformation-requires...
|
| The offending text:
|
| Act now by adding "equality, inclusion, and diversity" as
| an agenda item for your next staff meeting, brownbag, or
| employee gathering.
|
| What, why?
|
| https://www.upi.com/Odd_News/2013/08/02/In-Seattle-the-terms...
|
| "Brownbag" is offensive in one context, and that means it's
| offensive in all contexts.
| mancerayder wrote:
| Other than DEI administrators, trainers and people in
| positions with DEI in them, who is actually getting offended?
| powerhour wrote:
| People that have to move their mouse a bit to hit the x
| button, apparently.
| deigestapo wrote:
| It's so they can track it, compute metrics, report, etc.
|
| This gets fed into indicators that can be used to boost the ESG
| (communism) score of publicly-traded companies.
| GuB-42 wrote:
| Goes well with the legal disclaimer that follows.
|
| The legal or whatever-not-technical department wanted to leave
| their mark.
| TaylorAlexander wrote:
| Having a statement like that shows people that they are open to
| suggestions on improvements. Since a lot of people are not so
| open to suggestions, it makes sense to me to include this
| language. They added a little X button so you can close it
| easily.
| [deleted]
| hn_throwaway_99 wrote:
| My reaction was "If you want to write some documentation with
| bias-free language, just write the documentation with bias-free
| language." Why the need for a long paragraph explaining "Look
| how great and sensitive we are!"
|
| I understand, and agree with, the desire to use inclusive
| language, but so much of this has just devolved into
| performative nonsense.
| mlyle wrote:
| Else you get questions, like, "why don't you say master/slave
| like everyone else?!@!!"
| kwhitefoot wrote:
| At this stage I think such questions can just be ignored.
| MarcoZavala wrote:
| alpb wrote:
| Saying that and usernames like "DoneWithAllThat" and
| "hn_throwaway", yeah, it checks out.
| hn_throwaway_99 wrote:
| Not sure exactly what point you're trying to make, but if
| it's "the risk of saying anything even _remotely_ critical
| of DEI tactics is a huge, gargantuan, giant career risk
| these days", then I wholeheartedly agree.
| 0xbadcafebee wrote:
| The docs may include "master/slave", and they don't want to get
| sued or bad PR, so this generic notice says "we don't like bad
| words but sometimes the industry uses bad words and that's
| unfortunate". If you click the _Learn More_ link in the
| paragraph, you'll learn more.
| redeeman wrote:
| deigestapo wrote:
| zorpner wrote:
| There is -- it's using words other than those, which is
| both easy and considerate.
| civilized wrote:
| It's been over two years since this was first identified... since
| this apparently affected many makes and models of SSDs, it would
| be nice to know if my laptop could be affected and if there's
| anything I could do about it.
| pmoriarty wrote:
| One thing everyone could and should be doing is backups.
| m0llusk wrote:
| Two things: Test restores or you don't actually have backups.
| Just saying.
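
One way to act on "test restores" is to restore a backup into a scratch directory and compare it with the original, file by file. The sketch below does that with SHA-256 digests. The function names are hypothetical, it reads each file fully into memory, and it checks only contents, so permissions, timestamps, and symlinks are not covered.

```python
import hashlib
from pathlib import Path

def tree_digests(root: Path) -> dict:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests

def restore_matches(original: Path, restored: Path) -> bool:
    """True if both trees contain the same files with identical contents."""
    return tree_digests(original) == tree_digests(restored)
```

A call might look like `restore_matches(Path("/home/me/docs"), Path("/tmp/restore-test/docs"))`, where both paths are placeholders.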
| chrischen wrote:
| I got bit by this with iPhone backups. I did a phone trade
| in and followed the "back up before trading in" instructions.
| Problem is after the trade in the backup failed to restore
| due to an unknown error. The whole manual syncing and
| backing up with a cable workflow with Apple is super fickle
| and riddled with bugs.
|
| Luckily I had Time Machine backups of my iOS backups and I
| managed to avoid losing too much data.
|
| As a sidenote it seems like Apple has pretty much neglected
| their offline backup and syncing workflow to drive more
| people to just pay for iCloud storage. Half the time my
| iPhone takes hours just to get detected by the mac when
| _plugged in._
| opencl wrote:
| This will not affect your laptop, all of the models affected by
| this are enterprise SAS SSDs.
|
| Of course your SSD might have some _other_ firmware bug that
| would eat your data, all you can do is search for the model
| number and see if the manufacturer has issued any
| notices/firmware updates.
| robocat wrote:
| > This will not affect your laptop
|
| That's just your presumptive opinion, right?
| Sakos wrote:
| How likely is it that they're using an enterprise SAS SSD
| in their laptop?
| yomkippur wrote:
| crap so it's certainly HP laptops. so which laptops are safe from
| this?
| mrkramer wrote:
| My HP laptop has Toshiba SSD. I'm not sure about other models.
| But I think only enterprise SSDs are affected.
| mistrial9 wrote:
| related topic - leaving SMART control tests ON for a (non-SSD)
| drive, apparently interferes with sleep; the drive will wake up
| to test itself. For some drives, I would prefer that not to
| happen and just stay quiet. Yet, testing for this behavior seems
| elusive -- querying the disk wakes it, and most linux disk tools
| seem unaware of sleep state. I just listen for the disk spinning,
| or notice a long pause before an operation.
| onion2k wrote:
| Backblaze have a great blog about things they learn about hard
| drives. It's been going for years, less about firmware issues and
| more about general usage.
| https://www.backblaze.com/blog/backblaze-drive-stats-for-q1-...
| usr1106 wrote:
| Cisco is not an SSD manufacturer. They write "industry-wide bug".
| Does that mean that more than one SSD manufacturer is affected
| (because they partially use the same firmware)? Further down they
| mention only SanDisk. Or is "industry-wide" just their newspeak
| for saying any SanDisk drive of an affected model, regardless of
| whether it is installed in a Cisco box or somewhere else?
| dr_zoidberg wrote:
| I'm interested here too. I've got a Crucial SSD from 2015
| that's been on about:
|
| * 100% of 2015-2017, let's add 2 years here
|
| * Aboutish 50% of days since 2018 to 2020
|
| * On and off again (5%?) since then until now.
|
| So it's about 3 years of full use? I'm eyeballing the use here.
| So it may be close to the numbers that were given, but I'm not
| sure. Guess I could check the SMART stats to get a precise
| number and from there decide what to do about it.
|
| Searching a bit it seems it's a well-known bug in "enterprise
| SSDs"[0, 1] (which my drive certainly isn't) but there aren't
| any real details about what causes it, other than "a firmware
| bug".
|
| [0] https://www.servethehome.com/hpe-issues-hpd7-fix-for-ssds-th...
|
| [1] https://www.anandtech.com/show/15673/dell-hpe-updates-for-40...
| dredmorbius wrote:
| The problem seems to be widely experienced.
|
| The Cisco report turned up in response to a post I'd made of
| the HN issue on the Fediverse:
|
| https://mastodon.infra.de/@galaxis/108622795822100862
| userbinator wrote:
| 40000 (or even 40960) seems an odd number to fail at. 64k or 32k
| would make the cause pretty obvious, but 40000 doesn't seem all
| that round in binary. Perhaps a 12-bit counter incrementing every
| 10h? This is puzzling.
|
| Of course, I am also entertaining the possibility that no one
| thought they would be in use for this long, which would certainly
| be evidence of planned obsolescence.
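
The counter guesses in the comment above come down to simple arithmetic, sketched here. The widths and tick intervals are only the ones floated in the thread (15-bit, 16-bit, and a 12-bit counter ticking every 10 hours); nothing here is known about the actual firmware.

```python
def wrap_hours(bits: int, tick_hours: float) -> float:
    """Hours until an unsigned counter of `bits` bits wraps, ticking every `tick_hours`."""
    return (2 ** bits) * tick_hours

# (bits, hours per tick) pairs taken from the guesses in the thread.
candidates = [(15, 1), (16, 1), (12, 10)]

for bits, tick in candidates:
    print(f"{bits}-bit counter, {tick} h per tick: wraps at {wrap_hours(bits, tick):,.0f} h")
```

None of these lands on 40,000 exactly: 32k and 64k give 32,768 and 65,536 hours, and the 12-bit, 10-hour variant gives 40,960.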
| twawaaay wrote:
| Very strange understanding of the word "evidence".
|
| No sane SSD manufacturer would do such a thing on purpose. You do
| it and you lose business, that's it.
|
| The simplest explanation is that somebody made an honest
| engineering mistake.
| bayindirh wrote:
| When you purchase a server (fleet), you get a long warranty
| with it. Generally 3 to 5 years. So you expect this fleet to
| stay in service for <=5 years mostly.
|
| Unless you burn through your SSDs, you're very unlikely to
| hit this event.
|
| When these servers continue to be used and disks all start
| to fail at the same time, this will obviously stink.
|
| The bathtub curve is not like this. You can _feel_ that.
| fartcannon wrote:
| Given the power dynamic between a single customer and large
| corporations, the smart thing to do is to assume malice until
| prove otherwise. This puts the onus on the corporations and,
| if we're lucky, creates an environment where they compete
| with each other to be seen as the most honest. The worst
| thing that happens is the single customer has to buy an SSD
| from someone they don't trust.
|
| If we do the opposite, as you say, and assume everything is
| an honest mistake, that puts pressure on the single customer
| to prove that the organization with a huge marketing budget
| is doing something wrong. In this situation, the worst thing
| that happens is we all get taken advantage of.
|
| Our collective distrust is the only power we have against
| massive marketing/PR budgets. It doesn't have to be angry, or
| sour, or cranky, we just collectively need to not take their
| word until we have a reason to do so.
| charcircuit wrote:
| Are you seriously saying that by default we should believe
| they intentionally planned to cause their customers to lose
| all of their data?
| [deleted]
| alliao wrote:
| planned obsolescence is quite a thing...?
| dtjb wrote:
| In some cases, but a product must fulfill its core
| purpose. If a SSD intentionally dumped data and self
| destructed at a set time, that would be disastrous for
| the brand. Same way a car doesn't adopt planned
| obsolescence by blowing up after 200k miles.
| bayindirh wrote:
| If a spinning rust can run for ~8 years without any
| problems, a consumer SSD can hit beyond 40K hours
| reliably, and everything is checked and tested tens of
| times because of the complexity of flash storage, I'd get
| suspicious too.
|
| Also, enterprise drives get firmware updates (regardless
| of spinning or not), and this firmware is automatically
| applied via RAID controller, so it could be remedied
| easily before it got this big if it's an actual error.
| justinsaccount wrote:
| Someone pointed out on the other thread that it could be 2^57
| nanoseconds:
|
|     >>> 2**57/10**9/3600
|     40031.996687737745
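
To see why 57 bits in particular stands out, the sketch below works the same arithmetic from the other direction: how many bits a nanosecond count needs at 40,000 hours, and where neighbouring widths would overflow. It assumes a plain unsigned nanosecond counter, which is the thread's guess and not a confirmed detail of the firmware.

```python
NS_PER_HOUR = 3600 * 10**9

# Nanoseconds elapsed after 40,000 hours, and the bits needed to hold that count.
ns_at_40k = 40_000 * NS_PER_HOUR
bits_needed = ns_at_40k.bit_length()

def overflow_hours(bits: int) -> float:
    """Hours until an unsigned nanosecond counter of `bits` bits overflows."""
    return 2 ** bits / NS_PER_HOUR

print("bits needed at 40,000 h:", bits_needed)
for bits in (56, 57, 58, 64):
    print(f"{bits} bits: overflows after {overflow_hours(bits):,.1f} h")
```

A 56-bit counter would overflow at roughly half the observed figure and a 58-bit one at roughly double, so 57 is the only width near 40,000 hours.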
| AaronFriel wrote:
| If it were 53, I'd wonder "are they storing the time in the
| integer part of a double precision float?" That wouldn't go
| negative, it'd just start absorbing increments without
| changing the value.
|
| Though that might cause a divide by zero?
|
| What could cause unexpected behavior at 57 bits?
|
| Perhaps storing fractions of an hour, like incrementing it
| every 1/16th of an hour and calculating a relative rate of
| change, causing a divide by zero?
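
The "absorbing increments" behaviour at 2^53 is easy to reproduce, and it can lead to a division by zero in the way the comment speculates. This sketch uses Python floats, which are IEEE 754 doubles; the `rate` function is a hypothetical stand-in for whatever the firmware might compute, not a description of it.

```python
# A double has a 53-bit significand, so consecutive integers are exact only up to 2**53.
LIMIT = float(2 ** 53)

absorbed = (LIMIT + 1.0 == LIMIT)             # the increment is lost to rounding
still_exact = ((LIMIT - 1.0) + 1.0 == LIMIT)  # just below the limit, +1 still works

def rate(prev_t: float, curr_t: float, delta_value: float) -> float:
    """Change per tick between two timestamps (hypothetical calculation)."""
    return delta_value / (curr_t - prev_t)

print("increment absorbed at 2**53:", absorbed)
print("increment exact below 2**53:", still_exact)
try:
    rate(LIMIT, LIMIT + 1.0, 5.0)
except ZeroDivisionError:
    print("elapsed time rounded to zero: division by zero")
```

Note that 2^53 nanoseconds is only about 2,500 hours, so this mechanism alone would not explain a failure at 40,000 hours.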
| mkl wrote:
| Do embedded CPUs like the one in an SSD have floating point
| units? It seems more likely to me that the upper bits in a
| 64 bit integer counter were used for something else.
| danielheath wrote:
| Packing a type flag into the upper bits of a 64 bit value
| is a reasonably common optimisation in dynamic language
| implementations (because it lets you use unboxed number
| arithmetic).
| jonas21 wrote:
| My overactive imagination thinks it went something like
| this:
|
| Engineer A: Gee, I need to store a few flags with each
| block, but there's nowhere to put them. Ah! We're storing
| timestamps as 64-bit _microseconds_. I can use a few of
| those bits and there'll still be enough to go thousands of
| years without overflowing.
|
| Engineer B: Gee, our SSDs are getting so fast, soon we'll
| be able to hit 1M writes/sec. But we're storing timestamps
| as microseconds. How can we generate unique timestamps for
| each write? Ah! I'll switch to nanoseconds. It's a good
| thing we have plenty of space in this 64-bit int.
|
| BOOM!
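
The imagined scenario above can be put into numbers. The sketch assumes a 64-bit word with the top 7 bits used for flags, a figure chosen only because 64 - 7 = 57 matches the 2^57 guess in the thread. The layout and all names are hypothetical.

```python
FLAG_BITS = 7                      # assumption: 64 - 57, to match the thread's guess
TIME_BITS = 64 - FLAG_BITS
TIME_MASK = (1 << TIME_BITS) - 1
WORD_MASK = (1 << 64) - 1

def pack(flags: int, ticks: int) -> int:
    """Store flags in the upper bits and a tick count in the lower 57 bits."""
    return ((flags << TIME_BITS) | (ticks & TIME_MASK)) & WORD_MASK

def unpack(word: int):
    """Return (flags, ticks) from a packed 64-bit word."""
    return word >> TIME_BITS, word & TIME_MASK

def years_until_wrap(ticks_per_second: int) -> float:
    """Years of power-on time before the 57-bit tick field wraps."""
    return 2 ** TIME_BITS / ticks_per_second / 3600 / 8766

print(f"microsecond ticks: {years_until_wrap(10**6):,.0f} years")
print(f"nanosecond ticks:  {years_until_wrap(10**9):.2f} years")

# One tick past the limit, the stored timestamp silently wraps to zero.
print(unpack(pack(0b101, TIME_MASK + 1)))
```

With microsecond ticks the field lasts thousands of years, as Engineer A expects. With nanosecond ticks it wraps after roughly four and a half years of power-on time, which is where the 40,000-hour figure would come from under these assumptions.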
___________________________________________________________________
(page generated 2022-07-10 23:00 UTC) |