Abstract: In this document I present a proof that
several of the ambitious claims by Bitcoin Knots supporters are not
true. Most notably, the claims about contiguous data. To prove this I
stored an over-66kB contiguous image file in the Bitcoin time
chain without using OP_RETURN, Taproot or following
policy rules and, to highlight the point, the image file is itself
an entire transaction. This document also explains how anyone can
verify this claim themselves using their own node.
Important update
I listened to feedback and also made a transaction that is BIP-110-compliant! Go to the end to verify this.
Contiguous data
The main claim of the Knots supporters is that whether the data is stored on disk contiguously or interrupted by other data has a bearing on whether it is legal. Therefore, they claim, we have to decrease the size of every bit of data Bitcoin stores to ensure nobody can do anything like that. The idea is already ridiculous. Imagine this in court:
- Judge: It is alleged that you stored illegal content on your device.
- Defendant: Yes, but it was split up in parts.
- Judge: OK, not guilty.
However, there is a way to show this idea is even more nonsensical
than it looks. Despite Bitcoin already supposedly not allowing more
than 520B of contiguous data except for transaction outputs whose
scripts start with OP_RETURN, I managed to make a
transaction that, when interpreted as an image file, can be processed
by readily available image viewers to show an image of Luke-Jr
crying:
Yes, that’s right. Not only is that image in the time chain and not only is it contiguously stored but also the entire transaction is an image, so you don’t even need to skip the initial bytes.
In addition, the transaction contains contiguous ASCII text longer
than 520 bytes without using OP_RETURN and it even
satisfies the rules imposed by BIP-110. That means that if you
validated only that text using BIP-110 logic, the transaction would
pass.
Verification of the proof
Using your own node
Here’s how you can verify it yourself on your own node using command line:
bitcoin-cli getrawtransaction b8cc570ef453508b6c6a1758bc591c276abdbe2b88881ae487ba4e858bd8d4be | xxd -r -p > luke.tiff

If you use your node on the desktop you can now directly open the
luke.tiff file in your home directory. If you use
ssh to access your node in the form of
ssh yourusername@yournode you need to first copy the file
to your computer using e.g. scp:
scp yourusername@yournode:luke.tiff luke.tiff

Note the colon between the hostname and the file name.
Then you can open it on your computer.
By looking at man xxd you
can verify that the only thing xxd -r -p is doing is
reversing the hex encoding bitcoin-cli uses. It does not
do any skipping of bytes or seeking in the file. (While
xxd itself does support seeking, that command is not
doing so.) Further, you can look at the output of
bitcoin-cli getrawtransaction to see that it does in fact
spit out a lot of hex characters that look like what an image file
might be. Or a transaction, depending on how you look at it. To
strengthen your confidence in the command you can also ask
some programmers, including non-bitcoiners, who know a bit about Linux
to confirm that this is true. Feel free to add an LLM into the mix. And
if you think I’ve just backdoored the image viewer (nice of you to
think I’d be able to do that!) you can try opening the image on some
old computer that didn’t receive updates for a year. Or try to get an
archived version of an image viewer. Or just use multiple devices to
see the image.
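The claim that `xxd -r -p` merely reverses hex encoding - no seeking, no skipping - can also be sanity-checked with a minimal round trip. Here is a sketch in Python rather than with xxd itself; `bytes.fromhex` plays the same role as `xxd -r -p` here, which is an assumption about equivalence you can verify against `man xxd`:

```python
# Hex decoding is a pure, position-preserving mapping: every pair of hex
# characters becomes exactly one byte, in order, nothing skipped or reordered.

def hex_to_bytes(hex_text: str) -> bytes:
    """Decode a plain hex dump the way `xxd -r -p` does (whitespace ignored)."""
    return bytes.fromhex("".join(hex_text.split()))

# 4d4d002a is the big-endian TIFF magic: "MM" followed by the number 42.
sample = "4d4d 002a 0000"
decoded = hex_to_bytes(sample)
assert decoded == b"\x4d\x4d\x00\x2a\x00\x00"
# Re-encoding gives back the same hex string, byte for byte:
assert decoded.hex() == "4d4d002a0000"
```

If the round trip changed anything, the decoded file would not open as a valid image at all - which is itself a check that no bytes were dropped.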
Additionally, you can extract the accompanying text using:
bitcoin-cli getrawtransaction b8cc570ef453508b6c6a1758bc591c276abdbe2b88881ae487ba4e858bd8d4be | xxd -r -p | tail -c +45889 | head -c 4390

This does obviously seek in the transaction and chops off the unneeded tail but it is still contiguous.
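The byte arithmetic of that pipeline is easy to restate: `tail -c +45889` starts output at byte 45889 (1-indexed), and `head -c 4390` keeps the next 4390 bytes. A sketch of the same slice in Python, using the offsets from the command above (the function name is just for illustration):

```python
def extract_text(raw_tx: bytes) -> bytes:
    """Replicate `tail -c +45889 | head -c 4390` on the raw transaction bytes.

    tail -c +N is 1-indexed, so byte 45889 is index 45888 in Python.
    """
    start = 45889 - 1
    length = 4390
    return raw_tx[start:start + length]

# Demo on dummy data: the slice picks one contiguous run of bytes.
dummy = bytes(range(256)) * 200  # 51200 bytes of filler
chunk = extract_text(dummy)
assert len(chunk) == 4390
assert chunk == dummy[45888:45888 + 4390]
```

The point stands either way: a single slice of consecutive bytes is extracted, with no gaps inside it.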
Using mempool.space
If you don’t have a node and trust mempool.space not to cooperate
with me to bamboozle you, you can use their service to download the
transaction directly in the binary form and save it as a tiff
file. Just right-click this
link and download it. Make sure to save the downloaded file
with .tiff extension. Then open it normally as any other
file.
If you copy the link and inspect it you will see that it points to a transaction with the same txid as mentioned above and that raw (binary) transaction mode is requested.
The claims debunked by this transaction
It matters whether the data in the transaction is contiguous or not
As you can see, despite the consensus rules supposedly not allowing contiguous data, I still managed to store a contiguous image file. Messing with consensus parameters won’t change that; it will only make it take a day or two longer to work around.
Use of
OP_RETURN is required to store contiguous data
You can inspect the transaction using:
bitcoin-cli getrawtransaction b8cc570ef453508b6c6a1758bc591c276abdbe2b88881ae487ba4e858bd8d4be 1 | less

Note the 1 at the end; less is
optional, just to not spam your terminal.
And see for yourself that the transaction doesn’t have a single
OP_RETURN opcode anywhere in its body. You could
completely ban OP_RETURN and it’d still be valid. You can
double check this with any transaction analyzer and it will say the
same thing: there is no OP_RETURN.
Taproot brought in a vulnerability that allows storing large data in the chain, this was not possible before
This claim is disproven by the transaction NOT using Taproot. This
can be verified too but to do that you need to retrieve the previous
transaction because it is always the scriptPubkey in the previous
transaction that dictates the rules of spending. Using the above
mentioned command, you can see the txid of the previous transaction
specified in the "txid": field within the
"vin" array. Note that each input has the same previous
txid. Use this txid to get the previous transaction:
bitcoin-cli getrawtransaction PUT-PARENT-HASH-HERE 1

In its outputs you will see that each one has:
"type": "witness_v0_scripthash"
Which is NOT Taproot, as Taproot would be v1.
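The distinction is mechanical: a native SegWit scriptPubKey is a version opcode followed by a single push of the witness program, where OP_0 (0x00) means v0 (P2WPKH/P2WSH) and OP_1 (0x51) means v1 (Taproot). A simplified sketch of that check - a real parser would also validate program lengths per BIP-141/BIP-341:

```python
def witness_version(script_pubkey: bytes):
    """Return the SegWit witness version of a scriptPubKey, or None.

    Simplified: checks only the leading version opcode and that the rest
    is a single direct push (push opcodes 0x02..0x28 push 2..40 bytes).
    """
    if len(script_pubkey) < 4:
        return None
    op, push_len = script_pubkey[0], script_pubkey[1]
    if not (2 <= push_len <= 40) or len(script_pubkey) != 2 + push_len:
        return None
    if op == 0x00:          # OP_0 -> witness v0 (P2WPKH / P2WSH)
        return 0
    if 0x51 <= op <= 0x60:  # OP_1..OP_16 -> witness v1..v16 (v1 = Taproot)
        return op - 0x50
    return None

# A P2WSH output (v0): OP_0 <32-byte script hash>
p2wsh = bytes([0x00, 0x20]) + bytes(32)
# A P2TR output (v1, Taproot): OP_1 <32-byte x-only pubkey>
p2tr = bytes([0x51, 0x20]) + bytes(32)
assert witness_version(p2wsh) == 0  # witness_v0_scripthash - not Taproot
assert witness_version(p2tr) == 1   # this is what Taproot would look like
```

So "witness_v0_scripthash" in the decoded output is enough to rule Taproot out by inspection of a single byte.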
Notably, the technique I used increases the processing cost on the nodes compared to inscriptions. So if you were to ban Taproot, people could just switch to my technique and the situation would become worse.
OP_IF is
required to store large data
The transaction doesn’t use a single OP_IF anywhere in
its scripts. In SegWit v0, the script is stored as the last element of
the witness - thus you can use the command above to get the script for
each input (there are only three). Then use this command to decode
it:
bitcoin-cli decodescript PASTE-THE-SCRIPT-HERE

You will see many operations but none of them is
OP_IF.
It’s difficult to get a non-standard transaction in
If that is the case, then why is BIP-110 proposing a consensus change?
Anyway, this transaction is non-standard mainly because of the
strange version number required to make the entire
transaction look like a valid image file. It also likely blows
through some limits but it still uses techniques that can
bypass those limits by placing the data in multiple inputs in a way
that still makes it contiguous. If being contiguous is what’s
relevant here, the point still stands, though one would need to skip
first n bytes of the transaction (depending on image
size).
I do admit that, based on my experience, getting a non-standard transaction which is in its entirety an illegal image into the chain would be significantly harder than what I did, but that doesn’t solve everything. It’s a bit more involved:
- If one wants a small (< 1kB) non-standard transaction mined this takes like 2 minutes to figure out. (However, I currently don’t know of a specific solution to encode an image in this way.)
- If one wants a large transaction with illegal contiguous content, the best way is to simply not try to make it an image in its entirety, so that it can be standard.
- A large non-standard transaction with legal content can be done, as seen here; it just takes a few days of waiting to get it in.
In any case, you’re not getting any legal effect by messing with policy filters and you might give an unfair advantage to the big miners instead. It’s really funny that people behind Ocean pool are promoting something that will hurt them in the long run.
This technique is still too hard to do
It took only one day to write the software that takes an image and embeds it into a transaction contiguously. This still required skipping the initial bytes. The more interesting version that makes the entire transaction a valid image file took two more days to code. Notably, this didn’t require crazy cryptographic, computer science or image processing skills. It was all just stupid fiddling with some constants and slightly modifying a readily available Open Source image library.
The only area I had an advantage in is knowing Bitcoin really well. But even there, 99% of it was just encoding rules, which are easy - literally the first programming-related thing I learned about Bitcoin. The rest was just knowing the consensus rules - what data belongs where.
Any programmer capable of understanding the basics of Bitcoin should be able to do this, even if potentially a bit more slowly. It might even be the case that hand-assembling the transaction would’ve been easier.
Notably, now that I have the software ready, I could produce many more such transactions. It runs in about 0.25s on a single core on my old machine and the code isn’t optimized much. So I could produce 16 images per second on a 4-core machine.
Why did I do this? Potential questions and answers.
Spammeeeeeeeeer!!!! You’re just a bad actor attacking Bitcoin
Honestly, I really, really, really didn’t want to do this. I have a track record of speaking against spam, against increasing the block size, bashing shitcoins like Monero that instead of removing data from the chain put more garbage into the chain to mask them, while praising Lightning Network and Taproot for doing the opposite. I really hate spam and I don’t want my node to process garbage. Including transactions themselves. Doing this kind of felt like stabbing myself.
However, there’s something I hate much more than spam: untruths. Or lies, as Luke likes to call them. I tried arguing about this in the past and showed a contiguous image encoded to fit into a witness, and yet the Knots supporters are still saying the same stuff over and over as if it hadn’t been debunked. I felt that the argument was hard to convey theoretically, as I had tried to do before.
I did consider using a test network but that has the same problem: people who are not too deep into Bitcoin will consider it “just theoretical”. Thus I came to the conclusion that it has to be the main chain - no theory, no “what if”s, no guessing. Just a plain transaction that anyone can verify on their own node. No complicated tools, no seeking, no strange programs. Just readily-available software.
A situation that will make anyone not believing this transaction really exists look like a LARPer for not running their own node or being too dumb to verify and run a command. Something so strange that people will hopefully want to share it. Something that can be verified. Just like Bitcoin itself.
Is it enough to justify the cost? Honestly, I’m not really sure at this point. But at the moment of writing I think it’s a very powerful argument and worth a shot. If you’re annoyed about it maybe, just maybe, you should’ve listened to Tone Vays saying “Now there will be more spam on blockchain because people will spam just to troll you.” Yep, he was pretty much right, though this is definitely not “just trolling”. Oh, speaking of him, you should watch more of his stuff. Seriously.
For me this is just a one time only project. I won’t be putting any more images into the chain. Maybe, just maybe, except a BIP-110-compliant version if people continue spreading nonsense. I didn’t decide on that yet.
Finally, this is the rare occasion when I’m not publishing the code on the Internet. I want this entire thing to be a one-time example and not a new wave of NFT shitcoins. If shitcoiners want to code it they will have to work for it, I’m not going to help them. The code is also the biggest garbage of a code I ever made (obviously, it is single-use), so no need to worry that if it leaks accidentally it will be usable by anyone but me. :)
Then why do you support Core?
Simply because the various limits proposed by the Knots supporters don’t work. The spammers will always find a workaround. Just like I did. And even worse, almost all of these workarounds, in addition to putting the spam in anyway, cause other problems, including increasing the amount of spam.
Take for example a spammer that wants to put 100B of data into the
chain. They won’t fit the 83B limit proposed by BIP-110, so the
spammer splits it into two parts. One part is 32B long and pretends
to be a P2TR public key and goes to one output, the rest goes into
OP_RETURN. But now the transaction has one more output.
Each output contains amount, which is 8B long and a
scriptPubkey which is at least 1B long (to encode its
length) and in this case it’s 35B long: 1B for length, 1B for version
number, 1B for push opcode and 32B for the data. If we count up all
the additional data it’s 11B. Thus the data the spammer wants
to put in is now 11% larger. The fee rates today make it cost about
1-2 sat, maybe 20 sats on a bad day.
But even worse, the extra data is now in the output
forever. The data in scripts that begin with
OP_RETURN is never even stored in the UTXO sets and
pruning nodes can get rid of it entirely. The data in other outputs
needs to be stored in the UTXO set - a database that needs to be always
accessible for block verification, cannot be pruned and requires
additional data for every output, namely:
- The transaction ID - 32B
- The index of the output in transaction that created it - 4B
- The height at which the output was mined, so that relative time locks can be verified - 4B
- A flag determining whether the output is coinbase or not - theoretically 1b but practically 1B.
That’s 41B more. If we count it together with the entire output data that also needs to be stored in the UTXO set, it becomes 41 + 8 + 35 = 84B. Now add to it the 11B that’s in the chain and you get to 95B. So with the size limit, your node has to store 95% more data while the spammer only pays 11% more fees for it.
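The arithmetic in the example above can be tallied in a few lines; the byte counts are the ones stated in the text:

```python
# Worked tally: a spammer with 100B of payload under the 83B OP_RETURN
# limit moves 32B into a fake P2TR output, and the rest into OP_RETURN.

payload = 100
fake_key = 32               # data smuggled as a P2TR "public key"

# Extra on-chain bytes for the additional output:
amount = 8                  # output amount field
script_len = 1              # scriptPubkey length prefix
version_op = 1              # witness version opcode
push_op = 1                 # push opcode for the 32B key
chain_overhead = amount + script_len + version_op + push_op
assert chain_overhead == 11            # 11% of the 100B payload

# Extra per-output bookkeeping forced into the UTXO set:
txid = 32                   # ID of the creating transaction
vout_index = 4              # output index within that transaction
height = 4                  # mined height (for relative time locks)
coinbase_flag = 1           # 1 bit in theory, 1 byte in practice
utxo_metadata = txid + vout_index + height + coinbase_flag
assert utxo_metadata == 41

# Total permanent storage: metadata + amount + full 35B scriptPubkey,
# plus the 11B of overhead sitting in the chain itself.
script_pubkey = script_len + version_op + push_op + fake_key  # 35B
total_extra = utxo_metadata + amount + script_pubkey + chain_overhead
assert total_extra == 95    # ~95% more stored per 100B of spam
```

The asymmetry is the whole point: the spammer pays fees on 11 extra bytes, while nodes carry 95 extra bytes, most of them in the unprunable UTXO set.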
So this doesn’t deter spam, it increases it. This is also the exact
reason why there is a non-witness penalty (misleadingly named as
witness discount) - data in the outputs is more harmful than in the
inputs. And this is the exact reason why the Core developers were in
favor of increasing the standardness limit on scripts beginning with
OP_RETURN from 83B to ~100kB - to remove the motivation
of spammers putting even more garbage on the chain. Oh yes, while
we’re at it, another lie that Knots supporters like to spread is that the
limit was removed - it was just increased to a much larger number.
But why have spam? Why not just stop it?
It is literally impossible. It’s like trying to fly by outlawing gravity. There are so many ways of putting data into the chain I can’t even count them.
But as BIP-110 says, it sends a signal that spam is not welcome!
First, spammers are not pussies that care about your signal. Second, you think you’re sending a “you are not welcome” signal. The signal I’m perceiving is “Here’s someone claiming you can’t put contiguous data on the chain; it would be a real fun challenge to prove them wrong and make them look like clowns by doing exactly what they claim is impossible.”
I’m just a normal non-programmer and this entire debate confuses me, what can I do?
Try to find any independent programmer outside of Bitcoin who you can trust and ask for help. It works, I know someone who did exactly that with a good result.
You have hurt Bitcoin, now my node will have to process this garbage
If a single person can significantly hurt Bitcoin then Bitcoin is already doomed and it had better die sooner rather than later so we don’t waste time on it. Sadly, your node will have to process that garbage but at least not store it forever, unlike the unspendable outputs created by the people who were incentivized to make them by the annoying policy rules.
If Knots proponents had actually read the original
OP_RETURN proposal on the mailing list and honestly
engaged with its arguments, potentially even debunking it, instead of
just screaming past it or lying to non-technical people about its true
nature, I would not have had the motivation to do this project and your
node would now process almost 67kB less than it does now. Think
about that the next time you engage in a dishonest discussion.
SegWit discount is still a problem, you even used it yourself!
No, just because I didn’t feel like paying 4 times as much for the transaction doesn’t make SegWit a problem. I could still have done it without it, but more importantly, “discount” is a misnomer for what is really a non-witness penalty. Removing the penalty would remove the incentive to keep data out of output scripts - see above. It could conceivably be the case that without the penalty one would find it easier to put the data into unspendable outputs that will sit in the UTXO set forever without the ability to prune them.
Here’s some constructive feedback for the Knots supporters: just drop BIP-110 and instead try to find consensus for a temporary soft fork decreasing the max block weight, just like Luke proposed several years ago. I would be inclined to support this if it had sufficient consensus. Though I strongly doubt it would get consensus given the block size wars drama, at least it’s not based on obviously wrong arguments.
BIP-110 still makes your transaction invalid
Only the specific mainnet transaction. I created another transaction for the BIP-110 regtest which is BIP-110-compliant. And as you can see, the transaction got larger, which means more spam would end up in Bitcoin because of BIP-110. I believe I also found a trick to work around some other limitation on the resulting image size but I’m not in the mood to implement it now.
Curiously, in some cases the 256 limit has no effect. Putting a 504B-long element into the witness (520 is the current limit) is more costly per byte than splitting it into two parts of 252 bytes, since the former requires three more bytes to encode the length while the latter requires two - one for each element. This holds for all values between 253 and 504 inclusive. However, it is again more advantageous to split 5200B into 21 elements of up to 252B, requiring 21B of overhead, rather than 10 elements requiring 30B of overhead. Practically speaking, though, one might want to prevent miners from modifying the data, and that requires at least 23B per push. So decreasing the limit from 520 to 256 would lead to an increase of overhead from about 5% to about 19%.
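The length-prefix arithmetic above follows from Bitcoin’s compact-size encoding: a witness element of up to 252 bytes gets a 1-byte length prefix, while 253-65535 bytes get a 3-byte prefix. A sketch of the comparison (the 23B-per-push figure for tamper-proof pushes is the text’s estimate and not derived here):

```python
def prefix_len(n: int) -> int:
    """Compact-size length prefix for a witness element of n bytes."""
    if n <= 252:
        return 1
    if n <= 0xFFFF:
        return 3
    return 5

def split_overhead(total: int, chunk: int) -> int:
    """Total prefix bytes when `total` bytes are split into chunks of `chunk`."""
    full, rem = divmod(total, chunk)
    return full * prefix_len(chunk) + (prefix_len(rem) if rem else 0)

# One 504B element costs a 3B prefix; two 252B elements cost 2B total.
assert prefix_len(504) == 3
assert split_overhead(504, 252) == 2

# 5200B: 21 elements of <=252B need 21B; 10 elements of 520B need 30B.
assert split_overhead(5200, 252) == 21
assert split_overhead(5200, 520) == 30
```

So the crossover depends on how much data is stored: the prefix savings of small chunks sometimes outweigh the cost of having more of them, and sometimes not.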
Yes, the size increases but also the spammers will have to pay more which disincentivizes them
Do you really think that someone willing to pay millions for monkey pictures will realistically be demotivated by a few thousand sats? Heck, I don’t have a monkey picture to sell or buy for millions and just the mere annoyance with the lies of the prominent Knots supporters made me willing to pay for more than 100% overhead. (Yes, my transaction could’ve been much, much smaller if I didn’t also want to make the entire transaction into an image file.)
The irony is, to make the spam more expensive the fees need to be higher, and for the fees to be higher the blocks need to be fuller, and for the blocks to be fuller there needs to be more data for nodes to process. Well, technically a block size limit decrease soft fork would also do the job but that might be very unpopular.
But we still need to obey the government! If they say bad pictures on nodes are illegal we have to change our node software.
If you listen to the government when it says “change your node software because of bad pictures” you will listen to it when they say “add inflation to your node”, “add censorship to your node”, “turn your node into a spyware”.
The entire point of Bitcoin is that the government can’t enforce such commands because they don’t know/can’t prove you’re running it. If they can, you have a spyware problem - a much bigger issue than illegal pictures on the chain. The irony is that being able to run Bitcoin covertly is part of the reason behind smaller block size limit.
Running software that is inherently designed to resist government and then insisting on changing it based on government wishes is quite something.
You still had to hex-decode the transaction!
And in the case of bitcoin-cli, you also have to
hex-decode outputs containing a script beginning with
OP_RETURN. The data is still stored and transmitted in
non-hex form for the simple reason that hex would take up exactly twice
as much space. Instead of 767GB it would take 1.5 TB.
This is obviously not happening, though I heard someone non-technical say it is so. I think it was an honest mistake.
Anyway, if hex encoding were a good argument for this then it would be equally good for other kinds of data.
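The factor of two is inherent to hex: each byte becomes exactly two ASCII characters. A two-line check, scaled to the ~767GB figure above:

```python
# Hex encoding maps each byte to two characters, so the size exactly doubles.
raw = bytes(range(256))
assert len(raw.hex()) == 2 * len(raw)

# Scaled up: ~767 GB of raw chain data would be ~1534 GB (~1.5 TB) as hex.
chain_gb = 767
assert chain_gb * 2 == 1534
```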
You’re
just misinterpreting the Bitcoin data, Bitcoin only supports arbitrary
data in OP_RETURN outputs and nowhere else
Why does BIP-110 try to restrict the length of any other data then? Also, who’s to say that this isn’t an image and it isn’t Bitcoin that is just misinterpreting an image file as a transaction? I did in fact construct most of the transaction using an image encoder, not the other way around.
The image on this website is not the transaction itself!
As it shouldn’t be because you shouldn’t trust me when I claim that I put that stuff into the chain. Just as you shouldn’t blindly trust the developers (Core or Knots). This is Bitcoin - don’t trust, verify. Go and get it from the chain yourself.
He used
OP_SHA256, let’s ban that!
Yes, do! You will break the entire Lightning Network and all the coffee transactions made on it will now have to be processed by your node!
We need to ban transaction version 536870912 now!
That will only stop the more interesting variant of this transaction. It will not stop all possible ways this can be achieved and definitely won’t stop the simpler method where one has to seek into the transaction but the data is still contiguous.
Aren’t
you the guy who found a bug in the OP_RETURN PR?
I’m the guy who found an ambiguity in the PR and pointed out that the meaning of the parameter could be interpreted either way. I don’t decide which meaning is the correct one. This is a normal thing in software development: sometimes it’s not clear whether something is a bug or intended. Luke spun it as a bug because he thinks the meaning is clear, but it is objectively not clear: the explanation that the parameter means the data stored, not the overhead, also makes sense given that the parameter already ignored the overhead.
Update - the BIP-110 compliant variant
Huuuuge thanks to Luke-Jr for alerting me to the fact that I previously misread part of BIP-110. His hint helped me avoid wasting an hour or two tracking down a “weird, nonsensical bug” and as such I could create this BIP-110-compliant transaction sooner.
I got a lot of criticism/feedback that “BIP-110 works because it stops that specific transaction”. This came from people who don’t understand that just as Knots can change a constant in their code, so can I. I appreciate, though, that even if one understands the argument, actually generating a BIP-110-compliant transaction is much more convincing than claiming “I will change a constant”.
However, there is a challenge with proving that my transaction is BIP-110-compliant: I cannot do it on mainnet because mainnet isn’t enforcing the rule yet! Even if I broadcast it on mainnet today, nobody could prove that it is compliant unless the fork were already activated. But the point of this discussion is to have it before activation. Plus, I don’t like waiting for half a year.
Verification of the proof
So the only way to prove a transaction is BIP-110-compliant is to
test it against the working BIP-110 implementation using a test
network - the best for this is regtest. From the official BIP-110
website we can navigate to GitHub and verify that their regtest
implementation enforces the rules. In the file
kernel/chainparams.cpp we can find the section handling
regtest by searching for the REGTEST keyword.
Scrolling down a bit we can find this line:
consensus.vDeployments[Consensus::DEPLOYMENT_REDUCED_DATA].min_activation_height = 0;

An activation height of 0 means the rules are active
right away without any mining - this is a usual thing to do, we can
see a similar thing above with Taproot, there’s even a comment
explaining it.
Therefore we can just use a Knots release in regtest mode to check the transaction.
IMPORTANT WARNING: I have not verified that the knots fork that claims to enforce BIP-110 is not actually a malware that does anything other than enforce the BIP-110 rules. For security reasons you should NOT run it on any computer you have private data on without proper isolation, such as virtual machine. I AM NOT RESPONSIBLE FOR ANY DAMAGES THAT YOU MAY EXPERIENCE BY RUNNING THE KNOTS SOFTWARE. DO IT AT YOUR OWN RISK!
To make things easier, I wrote a script that you can inspect to verify what it does and then run. I know it can be a lot to ask for some people but:
- The script is very simple and not too long, it consists of only a few basic steps.
- The script is heavily commented explaining the purpose of each command.
- It should be easy for most Linux users (and possibly advanced macOS users) with a bit of experience to verify it. You can just get a neutral friend to check it for you.
I packaged the script together with the transaction in an archive for easier handling.
- Download the archive
- Unpack the archive
- Open the luke.tiff file to verify it’s a contiguous image file
- Read the script and verify that it’s doing what I claim it’s doing
- Open the terminal and enter the contigous-data-in-bip-110-verify directory - the script will not work without it!
- Run the script according to the instructions in its top comments.
E.g. in a fresh VM you can run
./verify_transaction_is_bip110.sh --download-knots
If what I’m saying is true you will see the script successfully mine a block with that transaction included. (The last command will output the hash of the block.)
Acknowledgment
Thanks to the person who helped me generate the image and to everyone who provided any feedback and help with this project!