Cryptographic Flaw in Libbitcoin Explorer Cryptocurrency Wallet

Cryptographic flaws still matter. Here’s a flaw in the random-number generator used to create private keys. The seed has only 32 bits of entropy.

Seems like this flaw is being exploited in the wild.

EDITED TO ADD (8/14): A good explainer.

Tags: cryptocurrency, keys, random numbers

Posted on August 10, 2023 at 7:12 AM • 37 Comments

Comments

Ted • August 10, 2023 8:30 AM

Prof Bill Buchanan OBE also wrote up an excellent (and easy to digest) explainer on Milk Sad as he does on lots of cryptography topics and news.

“A Novice Mistake: Meet Milk Sad … And The 32-bit Key!!!!!!”

https://medium.com/asecuritysite-when-bob-met-alice/a-novice-mistake-meet-milk-sad-and-the-32-bit-key-ba308fb2b633

Clive Robinson • August 10, 2023 10:43 AM

@ Bruce,

If I remember rightly it was 94 or 95 when you wrote about the lack of entropy in a well known browser of the time. They had also decided to keep it hidden via “security by obscurity”.

If memory serves it used only SysTime, ProcID and UsrID (So next to no entropy either).

These people are doing even less…

That’s just under 30 years that this has been publicly known to be not just a very bad idea, but totally insecure.

As this is in a “financial application” and millions if not billions of dollars could be expected to be held behind it…

It’s difficult not to go to “Malice” rather than “incompetence”.

Speaking of which, who remembers the 2008 Debian line deletion followed by the even worse clean-up that got immortalized by XKCD 424,

https://xkcd.com/424/

David in Toronto • August 10, 2023 11:46 AM

So to summarize, they use a more secure algorithm seeded with a small seed and an insecure PRNG.

Or to use the puffed rice analogy, the seed is the number of a single grain of rice in a bag that they boil up to fill an entire gigantic swimming pool! They mistakenly thought they have the security of guessing a single grain of rice in the entire pool. But in reality you still only need to try all the numbers in the small bag.

Classic DIY problem.

iAPX • August 10, 2023 12:08 PM

Even with a hard-drive to do the lookup with public-key hashmap and storing related 2^32 private keys, a basic computer could identify exposed wallets faster than it could generate the transactions, knowing how much is in each wallet because it’s exposed through the blockchain! lol!

This is humongous!

There might be billions at stake, only for a flawed PRNG.
And you don’t really know if the software stack that you use, on your smartphone usually or in your computer, uses this libbitcoin that seems very popular!

I have very mixed feeling about crypto-currencies and bitcoin specifically, but wasn’t expecting what it looks like a serious supply chain attack…

sanford h • August 10, 2023 12:37 PM

@ Clive Robinson,

That’s just under 30 years that this has been publicly known to be not just a very bad idea, but totally insecure.

That’s a rather generous phrasing. I think cryptographers always knew the need for entropy, and that bug was just one of the most visible early fuckups to prove that point (assuming it was not a backdoor). Unfortunately, most operating systems and languages continued to make entropy collection overly difficult for the next two decades or so. It’s a little better now, but only a little.

It’s crazy how much of a shitshow it is in C, especially. We’ve got rand(), rand_r(), random(), rand_r(), and a whole set of *rand48() functions. Some are not thread-safe. Others need to be seeded, and where’s that seed gonna come from? None are secure, so we should probably just forget them. One can maybe open /dev/random, except /dev/urandom is now considered better, and getentropy() is preferred because none of the files work if the namespace blocks /dev or the process is out of file descriptors. Even getentropy() is defined such that it can fail, and maybe on some systems it will, so that needs to be handled. MS Windows, of course, is entirely different.

One could just call OpenSSL’s function to get entropy, though someone already mentioned the well-known failure in that. And it’s only really appropriate for cryptographic code. If I just want to make, say, a randomized binary tree (to avoid algorithmic complexity attacks), where computer-science papers just hand-wave away the entropy source, I may not want to link against that library.

Every language should just have a well-known function to randomly fill a buffer, and some wrappers to generate numbers in a specified range (avoiding bias), but we’re still not there.

Clive Robinson • August 10, 2023 3:59 PM

@ iAPX, RobertT, SpaceLifeForm, ALL,

Re : A chain has two ends and many links some forged badly.

“… but wasn’t expecting what it looks like a serious supply chain attack…”

We know that the closest –in time– end of the chain is people getting value taken off of the block chain thus out of their pocket.

But where is the other end of the chain?

How far do we stretch back –in time– or do we just stop at the first badly forged link?

That is do we stop at,

1, The crooks took advantage here.
2, The developers failed here.
3, The library designers failed to document sufficiently here.
…
…
…
T-Nt, The NSA started finessing information and standards to do with RNGs here.

@RobertT and myself have had a conversation over this after @SpaceLifeForm posted the original link to milksad,

https://www.schneier.com/blog/archives/2023/08/friday-squid-blogging-2023-squid-oil-global-market-report.html/#comment-425344

@RobertT certainly feels we should go a very long way back, at least a quarter of a century “publicly”

“This cannot be accidental, nobody that has been actively involved in secure payment systems at any point in the last 25 years, can be in any doubt about the weaknesses of PRNG’s.”

I was playing with breaking the PRNG in BASIC on the PrimosOS back in 1978, as I’d used it to write some games and they gave suspicious “sequence” results even though it appeared statistically flat (turns out it was a crap AddC-n-ModN style you could seed. The give away was the offset value as the first value out after you seeded it).

@RobertT also kind of makes it clear he thinks I think it goes back further and he does as well,

“It might be time to ask ourselves who is setting these weak TRNG standards and why? but that’s a question way beyond my pay grade.”

If we start going back through the historical record we find that @RobertT is correct when he says,

“anyone working in security systems should have known this weakness as far back as the late 70’s, Clive will probably say 60’s and others will add 10 years to that number.”

I knew as a student from my own researching[1] back in the later half of the 70’s computer RNGs were bad news in many ways not just for games but especially for codes, and not a lot of use for simulators either (it screwed up my satellite orbit analysis testing).

But we also know from various records that the NSA were all over LFSRs and trying to stop information about them and their taps becoming public. There is the anecdote about a friend phoning a friend at the puzzle palace to get some tap info on LFSRs and getting rather more than the brush off.

I had this sort of behaviour confirmed toward the end of the 1980’s, when involved with British and Commonwealth Crypto Kit, in particular the unreliable BID 610[2].

Originally I’d assumed the “weakening” within the BID 610/1 combiner circuits was in case the units got captured, because it allows the weak/strong key divide seen in other examinable potentially “field use” thus capturable systems that the UK and US SigInt agencies designed. However these days I suspect it was more likely to keep certain allies clueless about certain techniques being used against them (something the NSA tried to do to GCHQ, that they got caught out on, but did not flag up GCHQ was actively doing to them). So as the old saying has it “No honour amongst thieves”.

So @RobertT suggesting 1950’s and the NSA and UK equivalent might well be correct…

[1] Yes it sounds “grandiose” but my mother was a historian and my father had been a senior accountant and involved in forensic accounting before going on to teach the subject. Also he was a member of the ACM in the 1950’s and later and had kept the journals. They both taught me to research from at least as early as when I was a Boy Cub and throughout school. I guess even earlier as I helped my mum do “field research” when she was hunting down Gunpowder Mills in North Surrey and their connection to Art and Artists (see the famous John Everett Millais painting of “Ophelia”, and his acquaintance Holman Hunt and his “Light of the World” as just two of the painting locations we tracked down to the actual spots used).

[2] The BID 610 was one of the first “all transistor” crypto units, and was based on pre-1960’s theoretical and practical work that went into the then “Oh so secret” Liverpool based “Plessey Crypto”. They developed it as an entrant for the NATO “Tapeless Rotorless On-Line”(TROL) evaluation for intra-national military use. As such it was not “top notch” even for the pre 1960’s as can be seen if you take the time to work through the two combiner “non-linear” circuits from the two LFSRs (documentation has appeared on-line if you look for it).

modem phonemes • August 10, 2023 4:16 PM

Obligatory

Your sins will find you out 😉

“Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.”

https://mcnp.lanl.gov/pdf_files/InBook_Computing_1961_Neumann_JohnVonNeumannCollectedWorks_VariousTechniquesUsedinConnectionwithRandomDigits.pdf

Jay_B • August 10, 2023 5:59 PM

See, this is why I create my own encryption algorithms! /s

SpaceLifeForm • August 10, 2023 7:27 PM

Do not use a clock.

Tick, tick, tick

Bitflip, bitflip, bitflip

There may be a pattern there

Ticktickman • August 10, 2023 8:01 PM

@ SpaceLifeForm

Repent !

https://en.m.wikipedia.org/wiki/%22Repent,_Harlequin!%22_Said_the_Ticktockman

Clive Robinson • August 10, 2023 8:25 PM

@ SpaceLifeForm,

Re : Time is not universal, but progress for man all but is.

“Bitflip, bitflip, bitflip”

You make the call (239-11-1517) on,

Which, the stream of time is a mear cipher as is the intent of the others code of living.

Mr C • August 11, 2023 1:18 AM

Gee, it’s almost as if these cryptobros aren’t actually qualified to develop security-sensitive code…

Ted • August 11, 2023 2:28 AM

This is all so weird 😵‍💫

There be some interesting dialogue happening on Twitter/X with Libbitcoin’s author. From @evoskuil:

We removed all platform entropy trust. The pseudorandom class (in libbitcoin-system) is used in scenarios in other libs that do not call for true randomness. Using it for the purpose of non-production demo seeding is fairly obvious.

https://twitter.com/evoskuil/status/1689819889022689280

A secure PRNG is not secure without a truly random seed. Software cannot produce randomness, and hardware cannot be trusted to do so – unless maybe you build it yourself.

https://twitter.com/evoskuil/status/1689795719991787521

I really don’t know what to make of this.

Commentary from a Reddit post:

Furthermore, some people have pointed out that development of the library ceased around the same time that the first related theft of funds occurred. If this is true, this seems like suspicious timing.

Indeed, it’s something.

https://www.reddit.com/r/Bitcoin/comments/15nbzgo/psa_severe_libbitcoin_vulnerability_if_you_used/?rdt=64856

Winter • August 11, 2023 5:51 AM

Ted

Software cannot produce randomness, and hardware cannot be trusted to do so – unless maybe you build it yourself.

The basic message of this post and comments is that you should know your entropy.

It is well known that there is always entropy when the temperature is above absolute zero degrees Kelvin. Which it always is, everywhere in the universe.

Recording entropy is also very easy. Every measuring device, eg, microphone, antenna, or camera is recording environmental “noise”. Put a microphone near my computer fan, or point a camera at a windy tree canopy and you have entropy. Moreover, to spy on my environment, other measurement devices in my specific location are necessary, which is expensive and does not scale.

And I can easily hear and see whether my recorded entropy is really entropy.

So what is the problem?

I think it is ideology, users and efficiency.

Ideology: Entropy must be perfect. People think it must be either the clock or perfect quantum pure randomness. Which means they end up with the clock time or a broken chip.
Users: The lowest denominator in users is they don’t really care. So they really do not want to know or check whether stuff works. Any system set-up that records environmental entropy will break because it will at some point stop recording and the user will not check.
Efficiency: Whatever entropy source we will use, it will not be “purely” random. Therefore, a system must be found that is better so we have more bits for less money. And then the system is not robust anymore and will simply fail catastrophically.

As usual, security is hard and users do not appreciate it until it fails

Clive Robinson • August 11, 2023 7:53 AM

@ Mr C, ALL,

“Gee, it’s almost as if these cryptobros aren’t actually qualified to develop security-sensitive code…”

Well… If you look at what’s been going on at Blackhat about Crypto-Wallets you will see others have serious faults.

https://www.bleepingcomputer.com/news/cryptocurrency/new-bitforge-cryptocurrency-wallet-flaws-lets-hackers-steal-crypto/

But you could take a flip view that they’ve not been clever enough to sufficiently hide their “criminal enterprise schemes”.

I think most know my views on the crypto currency systems being first and foremost a set of short and long cons designed to separate less clever speculators from their money.

Maybe some have decided the con games are conning to an end so they are just “Cleaning up by more obvious criminality”.

But remember that all these supposedly secure systems have been “given for free” and that maybe,

“People should look a gift horse in the mouth…”

Something the Trojans[1] might have regretted not doing…

Remember “Free” and “Open” are not indicators of “quality” or “security” and at the end of the day around about 3-5% of any large Western population are considered crooks or worse.

[1] Allegedly after a decade of the Trojan War, and a stalemate besieging the city of Troy, the Greek Army had not beaten the Trojans. The Greek legend says Odysseus decided a new trick should be tried. In that the Greeks built a large wooden tribute “Gift horse” that Odysseus and a few select Greek soldiers hid in. The Greek main forces then appeared to “sail away”. That night Odysseus and the small Greek force crept out of the horse and opened the gates for the rest of the Greek army to enter the city which they then destroyed without mercy. Whilst the Wooden Horse is certainly myth, –this was the end of the Bronze age,– the small force “commando tactics” to open the city gates are entirely plausible.

Mr C • August 11, 2023 9:14 AM

@Clive:

Are you suggesting this was a deliberate backdoor? That strikes me as a really dumb criminal scheme, since the developer responsible for a bug like this is certain to come under investigative scrutiny.

modem phonemes • August 11, 2023 9:28 AM

@ Clive Robinson

Re: Trojan Horse

There are some additional details that might be relevant in security situations.

Troy had a horse cult, so the Horse was even more a social engineering hack than would be just any gift. A warning was sounded by Laocoön, who then with his sons was destroyed by an agency that seemed more than human, typical for situations where the messenger says something people don’t want to hear. The Horse was tested to see if there was some trick: Helen walked around the Horse calling and speaking to soldiers who might be inside, using the voices of their wives, to tempt the soldiers to speak. But the hackers were disciplined and ignored the honey pot.

iAPX • August 11, 2023 9:34 AM

Are you suggesting this was a deliberate backdoor?

I am.
This is so obviously a fail that the Security Team evaluating it had to check it twice to be sure, and did a POC to convince itself and others that the PRNG seed was only a 32-bit time…

It is NOT as if CSPRNGs were a new discovery, as if PRNG flaws weren’t known, or as if backdoored PRNGs hadn’t been issued for decades by the three-letter orgs.

There are good resources online, with great authors, such as Bruce Schneier himself, that described all that and prescribed usage of non-broken and non-backdoored (as much as we know) CSPRNG.

sanford h • August 11, 2023 11:52 AM

Re: wallets with serious faults, possible backdoors

Are cryptocurrency wallets more flawed than other software? Bruce’s readers might hope security-sensitive code is written to a higher standard, but we’ve seen many signs it’s not. Linus Torvalds once said that “security problems are just bugs.” Most software is absolutely full of bugs. I’ve sometimes filed 20 or 30 in a work day, when assigned to test some unfamiliar code for the first time.

Error-handling, in particular, is often skipped. Almost nobody writes the proper EINTR loops for POSIX code (which are stupid, to be fair, but a library cannot assume the absence of signal handlers or the presence of SA_RESTART), or handles someone redirecting their “hello, world” program’s output to /dev/full, or consistently handles out-of-memory conditions (there are whole languages that just skip this, and libraries that call abort()—sometimes without warning their users).

It kind of sucks, though, that we have few useful formalisms to deal with most of this. For example, if getrandom(buf, 32, 0) returns less than 32, it’s an error to assume anything about the last byte of buf. Ideally, a compiler would know that and warn a programmer if any code path could lead to that behavior. Something like Perl’s “taint mode” that could verify all 32 bytes of data passed to a crypto function came from a good entropy source would be cool.

The backdoor theories being raised are interesting, but assume the programmers involved knew anything about crypto. I don’t know about this case; in general, it might be naïve: I wouldn’t be surprised to see someone say “this function said to give it random data, so I called rand(); what’s the problem?”. Or to take some PRNG from a theoretical book or webpage; the Mersenne Twister looks fun to implement, right? Its Wikipedia article doesn’t mention the lack of cryptographic security till about three pages in, and who knows whether the implementer will read that far or realize it means “someone will steal your ‘money’ ”? Even if it were intentional, the programmer would just need plausible deniability or “reasonable doubt”, and such flaws existed long before it was possible to easily and anonymously profit from them; the Debian PRNG failure was found the year before Bitcoin was released, for example. As far as I know, neither security nor cryptography nor verification/bug-hunting/code-review are covered by the required courses of most computer science programs.

I’ve seen Bitcoin described as the most successful bug bounty program ever created, intentionally or otherwise. I suppose that statement could use some slight amendment to cover the trojan horse potential. There are literally dozens of cryptocurrency wallets, and it’d be surprising if nobody tried this obvious attack vector by now. Maybe a bit surprising to do it in open-source code, but that also helps deniability.

iAPX • August 11, 2023 1:01 PM

During our accelerated coordinated disclosure to the Libbitcoin team, the Libbitcoin team quickly disputed the relevancy of our findings and the CVE assignment.

They might be a team of dumb and dumber, but as the technical explanation by the Security Team is very easy to understand, you don’t need to have a PhD to figure out that if a wallet is emptied without user interaction or information in 1 day from a simple fast PC, this is no more a wallet.

We could agree that this team is toxic…

Ted • August 11, 2023 9:19 PM

@Winter

As usual, security is hard and users do not appreciate it until it fails

I’m chuckling at one of Prof Buchanan’s concluding remarks: “This is sheep-following-sheep.”

I almost feel like I could substitute the word ‘hustlers.’

Eric Voskuil said there were “explicit warnings against live wallet use.” Yet $900,000 was stolen. Apparently the warning wasn’t overly explicit.

The Distrust et al Milk Sad technical write up offers a few more details. Here are some excerpts:

Only pseudo-random? Alright, a pseudo-random number generator (PRNG) doesn’t have to be bad if it’s a Cryptographically Secure Pseudo Random Number Generator (CSPRNG)…

Wait a moment. mt19937, twister – this uses the Mersenne Twister PRNG? 🤔 At this point, the first alarm bells are going off. Mersenne Twister is not a CSPRNG, so it shouldn’t be in any code path that generates secrets…

What the hell !? A bad PRNG algorithm, seeded with only 32 bit of system time, used to generate long-lived wallet private keys that store cryptocurrency? 😧

Are you familiar with programming lingo? (I really am not.) Is “uint32_t” an explicit integer value for 32 bits??

https://milksad.info/disclosure.html#our-cryptocurrency-is-gone-but-how

sanford h • August 12, 2023 12:28 PM

@ Ted,

Is “uint32_t” an explicit integer value for 32 bits??

The cast to uint32_t converts its argument to an unsigned 32-bit integer, that argument apparently being the system time expressed as nanoseconds.

Ted • August 14, 2023 6:18 AM

Thanks @sanford h!

This might be a dumb question, but how do you know the system time would be expressed in nanoseconds?

I believe this was the code snippet the researchers commented on:

—————

// Use the clock for seeding.
const auto get_clock_seed = NOEXCEPT
{
const auto now = high_resolution_clock::now();
return static_cast(now.time_since_epoch().count());
};

// This is thread safe because the instance is thread static.
if (twister.get() == nullptr)
{
// Seed with high resolution clock.
twister.reset(new std::mt19937(get_clock_seed()));
}

—————

Ted • August 14, 2023 7:07 AM

@sanford h

Also sorry, “<uint32_t>” appears not to have made it into the comment field. Maybe it registered as an html tag?

Hope this works

——-

static_cast<uint32_t>(now.time_since_epoch().count());

——-

Clive Robinson • August 14, 2023 10:19 AM

@ Ted,

Re : Time ain’t what it used to be.

“This might be a dumb question, but how do you know the system time would be expressed in nanoseconds?”

You don’t and it probably is not, and a cast to uint32_t from either of a pair of 64bit ints may not give much of anything. Because the chances are one counts up in seconds and the other counts up in fractions of a second if on a Unix box. However on a MS Windows box it’s claimed to be 100nS increments.

First off,

I suggest you look in the C99 or later standard documentation, or an explanation thereof, such as,

https://en.cppreference.com/w/c/types/integer

You will find that if supported uint32_t is an unsigned 32bit int and of general usage not specific to time (which is kind of what you would expect an RNG seed to be).

Secondly you need to remember time is a bit of a sore thumb issue in C and various other standards. That is few people agree on “the best way” and this has not been helped by a long ongoing –since 2009 at least– argument as to if we should just drop leap seconds altogether not just in network time but “civil” wall clock time.

So you need to look it up in your specific compiler AND OS documentation.

Whilst the modern usage implies the minimum time slice resolution is in nS that is far from the actual case in most computers. In MS boxes this century it’s 100nS supposedly which is rather different to the original IBM PC 1573040 “ticks” per day from the keyboard / serial interface circuitry.

Originally the time returned under unix was based on the US mains frequency until 1972, when they changed time to be of a type “signed 32bit” and incremented once a second since the Unix epoch of 00:00:00 Jan 1st 1970 (in sort of UTC for reasons too droll to go into).

If you look up “The Unix Epoch issue” or “Year 2038” or just “Y2038” you will discover there is an issue…

Put simply a signed 32bit counter really has only 31bits of count. So at a rate of one count a second, come 03:14:07 Jan 19th 2038 UTC it will probably do one of three things…

1, Overflow into max negative time.
2, Overflow back to 00:00:00 jan 1st 1970 again.
3, Strange things like a system hang or worse.

Why one of three things?

Well because the actual clocks are “unknown hardware” thus “implementation” dependent…

Which is why any *nix system you come across could work differently which is always fun when porting code… And with luck Jan 2038 should –except in the case of a friend who will be 77 on that day and partying hard– pass peacefully by…

Unless of course you’ve got poorly ported software to deal with… Because *nix time and C times are not of necessity aligned (the Linux and BSD crowds do not see eye to eye, then there are the seldom used Unix standards, and a bad case of the grumps all round and people wanting to make it Intergalactic).

Worse yet unix time is generally not UTC or Posix compliant because of “leap seconds” that happen due to earth whizzing around the Sun less predictably than expected…

That is Unix time is always 86400 seconds a day whereas reality allows for 86398 to 86402 depending on how you decide to implement things. There are several ways leap seconds can be dealt with so keep your eyes open at the right time to see which you have but I’m not going to go into it as it has more exceptions than old MS Code…

But somewhere along the line someone decided that 1sec time slices were way too long… So micro (10^-6) and nano (10^-9) Seconds were decided by various people as should be OK as the minimum time increment… After all light travels about a foot in a nS… But this was added in another int as a fraction of a second not a count. And also the seconds became a signed 64bit int as that should cover rather more than the life of the universe around the 1970 epoch, with leap times adjusted by lookup table.

Well the bad news is “relativity” wants its say, and it’s asking for a lot finer than a nS if you are going places with any precision which by the way includes moving around the mobile phone network or manoeuvring by GPS to the correct parking spot… Oh and don’t forget clocks slow down as their velocity goes up…

But back on Earth consider how many nS there are in an IBM PC “tick” at around 18.2065 ticks a second (thanks to Baud Rate standards that go back to the early 1960’s and earlier).

Just to confuse the world MS use the Unix Time format for the file system display but something else internally. Some sources say MS stores time as 100nS (nanosecond) time intervals/slices using a linear count in a 64bit int since an epoch of 00:00:00 Jan 1st 1601 GMT. Others say 12AM UTC…

So yeh all good fun, but in reality very little entropy.

sanford h • August 14, 2023 10:42 AM

@ Ted,

I’m a bit uncertain about the nanosecond thing, but see the documentation for std::chrono::high_resolution_clock in C++:
https://en.cppreference.com/w/cpp/chrono/high_resolution_clock

The result is in “ticks”, “with the smallest tick period provided by the implementation”. It could be from the real-time or monotonic clock, or another, but if it’s nanoseconds that doesn’t much matter: the low 32 bits of a nanosecond clock will repeat every 4.2 seconds.

Some web searching suggested that the C++ library’s high-resolution clock returns nanoseconds on Linux. That would make sense, because the modern POSIX function to get the time is clock_gettime(); that uses struct timespec, which has separate second and nanosecond fields.

The same web search returned reports of Linux users consistently seeing zeroes on the end of the tv_nsec value, which could reduce the entropy well below 32 bits. I, however, gave the system call a quick test and saw all digits used on an x86_64 system with a 6.X kernel.

(By the way, I’m intentionally phrasing my messages to minimize the use of HTML and avoid the need for literal angle brackets. The Preview button’s been a no-op since the switch to WordPress—it’s been years, so I doubt Bruce is working on it—and large parts of a message could disappear if I get it wrong.)

Ted • August 15, 2023 12:31 AM

@Clive @sanford h

Thank you both so much for sharing links to the C++ reference website!

I am enthralled! I’d love to add something constructive to your comments, but I’ve just spent the last 40 mins trying to figure out the most basic elements of date and time in a web-based C++ editor 🙂

Your ‘high res’ points of reference have been fantastic leads for further research!!

RobertT • August 16, 2023 12:22 AM

@Winter
Building a TRNG is far from easy.
If you look at any real world sensor there’s a maximum of 12 to 14 bits of sensor / adc (analog to digital converter) range. However, for anything done onchip there’s a maximum of about 8 bits (0.5%) absolute accuracy as set by the voltage reference (typically 1.2V).
Ok this says nothing about the state of the bits beyond the sensor/adc range, so if I make a 20 bit ADC I’ll get 20 bits returned and 8 of these bits could be just flipping about in a seemingly random fashion. But don’t be fooled into calling these bits “Random” because they’ll be far from random especially IF someone is intentionally injecting RF onto your supply (or as is more common these days, the switched mode supply that is used to power the chip creates a power supply signal which dominates the state of the lower order bits)
So I could make a perfect TRNG module only to see it get used on a chip powered by a noisy supply and guess what, my perfect randomness would be completely undone and replaced by a somewhat random beat sequence generator. Almost nobody would pickup on this fault because the output would pass all available tests.
When someone says they want 32 bits of true “randomness” as a single measurement of anything they clearly don’t know what they’re talking about. It’s a silly statement.
About the best you can hope for is 4 bits of randomness per measurement. All “randomness” over and above this is generated by successive polling of the random source to get another 4 bits and another 4 bits and so on.
In the real world most TRNG modules have a single bit output and get polled 128 times for your 128 bit random number. BUT as I mentioned earlier power supply noise is the noise source that often dominates the measurement, so what you actually get is a beat frequency of the power supply switching frequency noise beating with the microcontroller sample frequency. BTW It’s very hard to make a TRNG module that is insensitive to power supply noise.
My personal preference is to use a fully differential 4th order self sigma delta modulator to make a sort of chaotic oscillator. But this is useless information because nobody really cares, (as in they’re not prepared to pay one penny extra for this functionality) and even this structure is inclined to lock to the power-supply switching frequency.

Even if you use the structures that I’m suggesting you are kidding yourself if you believe that your assembled “random number” isn’t just some variant of a jittery beat sequence. This means that it is highly likely that certain bits in the sequence will be highly correlated with other bits, how many bits is hard to say, the only test that this might fail is if the resultant “random number” has spurs in the Frequency domain.
OK so to get rid of these spurs most TRNG’s will incorporate a sort of PRNG (possibly done in firmware or maybe a complex LFSR) which whitens the raw output of the random source. In reality this stage adds almost no entropy because it’s just another unexpected sequence. It’s all unbelievably complex but it’s no more random than what you started with.

All the modern standards favor PRNG’s and so do most security researchers (probably because they understand CSPRNG’s) but real randomness only comes from the TRNG which very few people properly understand, and to make matters worse the available TRNG tests are frankly just a joke.

Clive Robinson • August 16, 2023 9:24 AM

@ sanford h, Ted,

Re : Low bits are not random.

“but if it’s nanoseconds that doesn’t much matter: the low 32 bits of a nanosecond clock will repeat every 4.2 seconds.”

As I noted above the current *nix time uses two 64bit signed integer accumulators. The first is a count of seconds either side of the epoch. The second is the fraction of a second as derived by the system “tick”.

Let’s just say the system tick is 1/1024 of a second, or about a millisecond; that means that of those 64bits only ten will change, and they will be in the top bits not the bottom bits.

Have a look at what the “C Standard” says about “casting” a long long signed int to a long signed int.

In the past it was simply a case of masking off bits, then it became allowing for the sign bit…

The point is you need to see if you are actually getting any entropy from the system tick at all.
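
One way to check this is to count how many distinct values the sub-second field can actually take. The sketch below is illustrative only: it simulates a clock that advances 1024 times a second (the figure used above) instead of reading a real one, and the helper names are hypothetical.

```python
import math

NS_PER_SECOND = 1_000_000_000


def quantise_to_tick(t_ns: int, ticks_per_second: int) -> int:
    """Round a nanosecond timestamp down to the most recent tick boundary."""
    tick_index = t_ns * ticks_per_second // NS_PER_SECOND
    return tick_index * NS_PER_SECOND // ticks_per_second


def max_entropy_bits(values) -> float:
    """Upper bound on entropy: log2 of the number of distinct values seen."""
    return math.log2(len(set(values)))


# Simulated readings of the sub-second field, taken at many different
# instants within one second, on a clock that only moves 1024 times a second.
readings = [quantise_to_tick(t, 1024) for t in range(0, NS_PER_SECOND, 9973)]
bits = max_entropy_bits(readings)
print(f"{len(readings)} readings, at most {bits:.1f} bits of entropy")

# For comparison, the low 32 bits of a true nanosecond counter wrap every:
wrap_seconds = 2 ** 32 / NS_PER_SECOND
print(f"low 32 bits repeat every {wrap_seconds:.2f} s")
```

On a real system the same count could be run over repeated readings of `time.time_ns() & 0xFFFFFFFF`; a result well below 32 bits would show that the tick, not the field width, sets the limit.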

Winter • August 16, 2023 11:56 AM

@RobertT

If you look at any real world sensor there’s a maximum of 12 to 14 bits of sensor / adc (analog to digital converter) range.

It depends. HiFi sound input, or a video chip, gets more bits per sample. FLAC and associated video compression will remove a lot of redundancy so you can get rough estimates of the actual entropy you record.

You will have to use very conservative estimates and nothing is quality assured. But there is a lot of entropy in sound and video. And it is very easy to check whether it is “sound” data.
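
A rough version of that check can be sketched with a general-purpose compressor. This assumes zlib as a stand-in for FLAC, and the function name is hypothetical. Compressed size only gives an upper bound on the entropy (a compressor that misses a pattern will overestimate), so it is useful for spotting obviously bad data, not for certifying good data.

```python
import os
import zlib


def compressed_bits_per_byte(data: bytes) -> float:
    """Rough upper bound on entropy per input byte, from the zlib-compressed size."""
    return 8 * len(zlib.compress(data, 9)) / len(data)


silence = bytes(10_000)        # a "recording" of a dead input: all zero samples
noise = os.urandom(10_000)     # operating system randomness, for comparison

print(f"silence: {compressed_bits_per_byte(silence):.3f} bits/byte")
print(f"noise:   {compressed_bits_per_byte(noise):.3f} bits/byte")
```

A recording that compresses to almost nothing is clearly not usable; one that does not compress at all still needs the conservative estimate mentioned above.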

sanford h • August 16, 2023 1:10 PM

@ Clive Robinson,

The point is you need to see if you are actually getting any entropy from the system tick at all.

In principle, sure. At first glance, tv_nsec also looks okay in hexadecimal, though perhaps with enough maths one could prove that this cryptosystem would have only 14 bits of entropy instead of 32. In practice, it hardly makes a difference: it’s probably faster and easier to brute-force the whole space than to try to reduce it.

Now, if we were to find a system that appeared to have only something like 80 bits of entropy, that’s when we’d want to start some fancier analysis to decide whether it falls into the category of “of course your money’s already gone”, “you’re screwed as soon as someone who knows FPGAs gets interested”, or “we’ll get supercomputer time if your balance is large enough”.
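
The gap between those categories is easy to put numbers on. The sketch below assumes a purely hypothetical rate of one million key derivations per second; real rates depend on the hardware and on how expensive each derivation is.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def seconds_to_exhaust(bits: int, guesses_per_second: float) -> float:
    """Worst-case time to try every value in a keyspace of the given size."""
    return 2 ** bits / guesses_per_second


RATE = 1e6  # hypothetical guesses per second, for illustration only

for bits in (14, 32, 80):
    seconds = seconds_to_exhaust(bits, RATE)
    print(f"{bits:2d} bits: {seconds:.3g} s ({seconds / SECONDS_PER_YEAR:.3g} years)")
```

At that assumed rate the 14-bit and 32-bit cases are both finished within hours, which is why reducing 32 bits to 14 is not worth the effort, while each additional bit beyond that doubles the work.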

Clive Robinson • August 16, 2023 5:23 PM

@ sanford h, ALL,

Re : Finding signals.

“though perhaps with enough maths one could prove that this cryptosystem would have only 14 bits of entropy instead of 32.”

This is where FFT and FWT become your friends and you do a spectral and sequency analysis respectively.

However, using a cheap “genuine analog” oscilloscope can also help.

Consider two free running ring oscillators feeding a D Type latch, one to the CLK the other to Din. If you look at the Qout on the scope, close in with the time base on a high trace rate, then it looks really random to the eye.

However turn the timebase down and eventually –with an analog scope– you see that you get a pattern where things bunch up then spread out, bunch up, spread out on a regular basis. This is a major clue you have a problem (and you might not see it on a modern digital scope due to sampling issues).

If you make a simple lowpass filter with a resistor and a capacitor and put Qout through it, you will see a near perfect sine wave, with a frequency that is the difference frequency between the two ring oscillators.

If you use just software via a counter that then drives a D2A converter you get a very pure sine wave in digitized form (it’s how I used to make high quality low frequency signal generators for modulation testing).

Feeding Qout into an FFT or FWT will pull these signals out with very little difficulty.
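
The beat effect can be reproduced in a few lines. This is an idealised sketch with no jitter at all: the frequencies are arbitrary illustrative values, the function names are hypothetical, and a plain DFT loop is used instead of an FFT library to keep it self-contained.

```python
import cmath


def latch_output(f_clk: int, f_data: int, n_samples: int) -> list:
    """Idealised D-type latch: on each CLK edge, capture a 50% duty square wave on Din."""
    # At clock edge n the Din oscillator has completed n * f_data / f_clk cycles.
    # Integer arithmetic gives the fractional part of that cycle count exactly.
    return [1 if 2 * ((n * f_data) % f_clk) < f_clk else 0 for n in range(n_samples)]


def dominant_frequency(samples: list, sample_rate: float) -> float:
    """Return the frequency of the largest DFT bin, ignoring the DC component."""
    n = len(samples)
    mean = sum(samples) / n
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2 + 1):
        acc = sum((s - mean) * cmath.exp(-2j * cmath.pi * k * i / n)
                  for i, s in enumerate(samples))
        if abs(acc) > best_magnitude:
            best_bin, best_magnitude = k, abs(acc)
    return best_bin * sample_rate / n


# Two "free running" oscillators at 1000 and 1010 units: Qout is dominated
# by a spur at the difference frequency.
qout = latch_output(f_clk=1000, f_data=1010, n_samples=1000)
spur = dominant_frequency(qout, sample_rate=1000)
print(f"strongest component in Qout: {spur} (difference frequency)")
```

Real ring oscillators add jitter on top of this pattern, which smears the spur but does not remove it; the underlying beat is what the spectral analysis picks out.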

The point is what looks random to the eye or simple statistical tests like the DIEHARD Battery tests very often is not.

The reason I mention it is it’s also tied up with @RobertT’s reply to @Winter,

https://www.schneier.com/blog/archives/2023/08/cryptographic-flaw-in-libbitcoin-explorer-cryptocurrency-wallet.html/#comment-425562

Many TRNG’s on chips are based on “Ring Oscillators” fed into D-Type latches… I can tell you now that the NSA et al are well aware of its failings.

So Intel for instance use “magic pixie dust” thinking and take the Qout and shove it through a clocked shift register or similar to get a digital word they can shove through a crypto algorithm.

If the data sheet says it’s a Hash Function then you might make the mistake of believing them… If however they use AES and a key they know, or one from a very limited range, then stripping the encryption is very easy to do and also very fast. Synchronizing to the ring oscillators is then not exactly difficult, which leaves a very limited search space to check…

It’s why I say,

1, Do not use a TRNG unless you can get to the source output before any debiasing or whitening.
2, Always do extra encryption using say AES-256 in some kind of mode on the manufacturer’s TRNG output using a random key.

The fact that the first is not possible with most On-Chip TRNG’s should be not just telling you something but screaming it at you and waving a big red flag.

Back at the end of the last century our host @Bruce designed an “entropy pool” called Yarrow. Variations on such entropy pools are a very sensible precaution when you have no choice but to use a TRNG output after it’s been whitened.

Doing it correctly can give you 512bit or better entropy given sufficient time to accumulate it.

However it was felt that Yarrow could be improved upon so @Bruce teamed up with two others and Fortuna was the result,

https://en.m.wikipedia.org/wiki/Fortuna_(PRNG)

https://www.schneier.com/wp-content/uploads/2015/12/fortuna.pdf

It’s described as a PRNG, which strictly it’s not, as a TRNG input is used, thus predictability should be broken if the correct procedures are followed. As such it is a CS-PRNG –block cipher in CTR mode so effectively a stream cipher generator– that is modulated –re-seeded– by an entropy pool driven by what is hoped / assumed are sources with actual entropy.
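
The general shape (a keyed generator running in counter mode, re-seeded from a pool) can be sketched as below. This is emphatically not Fortuna: it has a single pool instead of Fortuna's multiple pools and reseed schedule, it substitutes SHA-256 in counter mode for the block cipher because Python's standard library has no AES, and the class and method names are hypothetical. It is a toy to show the structure, not something to deploy.

```python
import hashlib


class PoolGenerator:
    """Toy reseedable generator: one entropy pool feeding a hash-based counter-mode generator."""

    def __init__(self):
        self._pool = hashlib.sha256()
        self._key = bytes(32)
        self._counter = 0
        self._seeded = False

    def add_entropy(self, source_id: int, data: bytes) -> None:
        """Mix a sample into the pool, tagged with its source and length."""
        header = bytes([source_id]) + len(data).to_bytes(2, "big")
        self._pool.update(header + data)

    def reseed(self) -> None:
        """Fold the pool into the generator key, then start a fresh pool."""
        self._key = hashlib.sha256(self._key + self._pool.digest()).digest()
        self._pool = hashlib.sha256()
        self._seeded = True

    def _block(self) -> bytes:
        self._counter += 1
        return hashlib.sha256(self._key + self._counter.to_bytes(16, "big")).digest()

    def random_bytes(self, n: int) -> bytes:
        if not self._seeded:
            raise RuntimeError("no entropy has been mixed in yet")
        out = b""
        while len(out) < n:
            out += self._block()
        # Rekey after every request so earlier output cannot be recomputed from later state.
        self._key = self._block()
        return out[:n]
```

Two instances fed identical samples produce identical output, which is the whole point of the discussion above: the generator is only as unpredictable as what goes into the pool.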

There are others around, I designed a system for embedded microprocessor based systems back last century. It was based on the ARC4 stream cipher driven by a BBS generator that was modulated by an actual TRNG source based on the “roulette wheel notion” where a noisy oscillator is sampled by a stable oscillator. I’ve described it in more depth years ago on this blog, so you can hunt it out and use it if you wish.

However if you are using a 64bit CPU that has decent AES hardware in it as many do these days then I’d go down the Fortuna route as it has been examined by other cryptographers of standing.

RobertT • August 16, 2023 6:31 PM

@Winter, believe it, or not, but I happen to know quite a lot about Audio and Video ADC’s and the noise characteristics of the systems they’re embedded into.

Sure you can buy a 20bit Sigma Delta Audio ADC and massively gain-up some noise source at the input to get instant Random Noise but what you’re missing is the low-pass digital filtering stages that happen between the raw ADC bit stream and the filtered 20bit Audio bit stream.

The raw output of a Sigma delta will usually show frequency spurs (FFT of the raw data stream) resulting from a mixing of the sigma delta sample clock with the power supply switching noise (or self-chaotic-oscillation in the case of a higher order sigma delta)

You just never get to see this because it is removed by the digital low pass filters. The actual modulator will often be clocked at 1MHz (or higher) whereas the Audio band is only 20Hz to 20kHz, so raw output is 50 times oversampled.

Video ADC’s tend to be done differently, usually some variant of a Successive Approximation ADC and 14bits is about as good as it gets. Interestingly one of the dominant noise sources for this type of ADC is the capacitor array (most 14 bit ADC’s use a half resistor, half capacitor array). The resistor noise is typical thermal noise, but what’s interesting is the capacitor noise which is kT/C noise.

kT/C is a very useful on-chip noise source and that’s all I’ll say on the topic.
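
For scale, kT/C noise is the thermal noise voltage left on a capacitor after sampling, with RMS value sqrt(kT/C), where k is Boltzmann's constant, T the absolute temperature and C the capacitance. A minimal sketch of the arithmetic follows; the capacitor values and the 300 K temperature are illustrative assumptions, not figures from any particular ADC.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K, exact SI value


def ktc_noise_vrms(capacitance_farads: float, temperature_kelvin: float = 300.0) -> float:
    """RMS thermal noise voltage sampled onto a capacitor: sqrt(kT/C)."""
    return math.sqrt(BOLTZMANN * temperature_kelvin / capacitance_farads)


for picofarads in (0.1, 1.0, 10.0):
    microvolts = ktc_noise_vrms(picofarads * 1e-12) * 1e6
    print(f"{picofarads:5.1f} pF -> {microvolts:6.1f} uV rms")
```

Because the noise falls only with the square root of the capacitance, small on-chip capacitors give comparatively large noise voltages.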

Winter • August 17, 2023 6:52 AM

@RobertT

Sure you can buy a 20bit Sigma Delta Audio ADC …

You make it sound like we are still limited to plain old telephone 8bit, 8kHz sound.

A cheap Edirol R09 already offers 24 bit, 48kHz sound recording with very nice stereo microphones. No power interference etc. as it works on batteries. Dynamic range is excellent.

But we are talking past each other. I never said we could get 1Mb/s perfect white noise. I said we can easily collect entropy from our environment to seed key generation. And if that is by 8bit, 8kHz sound, it just means you record a few seconds longer.

RobertT • August 17, 2023 6:17 PM

@Winter if you’re really interested in this topic then read up on
“Noise shaping” and “Sigma delta modulation”
while you’re at it pay close attention to any reference made to “dithering” especially within the second modulator stage.
It might just be noise but wrt TRNG’s getting the basics right is kind of important.

Winter • August 18, 2023 3:50 AM

@RobertT

@Winter if you’re really interested in this topic then read up on

More of a hobby. I spend more time getting rid of noise or working around it.

But I have used calibrated noise sources in the past. I quickly decided life is too short to cope with the hassle.

Steffen Schaumburg • August 27, 2023 5:07 AM

I can’t decide what’s funnier – the bug (or backdoor, if intentional) or the delusional denials by developers. Or that people would entrust money they can’t afford to lose to such software and generally crypto “currency” schemes in the first place 😉
