Blockchain is a distributed ledger technology used to store transactions and maintain a shared, immutable, and secure database. Cryptography is employed to secure and verify transactions, as well as to control the creation of new blocks in the chain. The distinguishing feature of blockchain is that it is decentralized, meaning it does not rely on a central authority to operate. It is primarily used in cryptocurrency applications, but also in the recording of documents, data, and various types of information for which certification and authentication methods are applied.
Figure 1 below shows one of the fundamental principles of blockchain: the actual "chain of blocks." Each block is a logical structure that stores information about transactions, documents, or data (depending on the purpose). Blocks contain an identifier called a "hash," a fixed-length alphanumeric string produced by applying a cryptographic hash function (such as SHA-256; older functions such as MD5 are no longer considered collision-resistant) to a given input text. The input may be, for example, the complete date or timestamp of the block’s registration, the first and last ten words of the text to be recorded, or any other element intended for registration. The hash acts as the block’s key: with a function such as SHA-256 it is computationally infeasible in practice to find two different inputs that yield the same hash, so each hash can be treated as unique within the system. It also serves as a security mechanism that protects the integrity of the data stored in the block, because any modification made to the block’s content after its creation changes the resulting hash. The original hash then no longer matches the modified content, which makes any unauthorized alteration detectable.
Figure 1. Basic schematic of the blockchain. It illustrates how blocks are linked together sequentially through their hashes, forming a fixed chain of registered data while remaining open to the addition of new blocks.
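The integrity check described above can be illustrated with a few lines of PHP. This is only a minimal sketch: the sample texts and variable names are assumptions made for the illustration, and SHA-256 is computed with PHP's built-in hash() function.

```php
<?php
// Hypothetical block content and the hash recorded when the block was created.
$content = "This is the text of document 1";
$recordedHash = hash("sha256", $content);

// Recomputing the hash over unchanged content reproduces the recorded value.
$unchangedOk = hash_equals($recordedHash, hash("sha256", $content));

// Changing a single character yields a different hash, so the recorded
// value no longer matches and the alteration is detected.
$tampered = "This is the text of document 2";
$tamperedOk = hash_equals($recordedHash, hash("sha256", $tampered));

echo "Unchanged content verifies: " . ($unchangedOk ? "yes" : "no") . PHP_EOL;
echo "Tampered content verifies: " . ($tamperedOk ? "yes" : "no") . PHP_EOL;
```

The comparison uses hash_equals(), which compares the two strings in constant time; a plain === comparison would give the same result here.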
To form the chain, each block must reference the hash of the previous block, which links the blocks sequentially. Any modification to a block changes its hash, so the reference held by the following block no longer matches and the break in the chain is easily detected. This endows the blockchain with security and data integrity properties that are highly suitable for the archiving and preservation of information.
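The linking and break detection just described can be sketched as follows. The helper names (linkHash, appendBlock, findBrokenBlock), the in-memory array, and the choice of fields that go into the hash are assumptions made for this illustration, not a prescribed design.

```php
<?php
// Hypothetical helper: hashes the fields of a block, joined with a separator.
function linkHash(array $block): string {
    return hash("sha256", implode("|", [
        $block["index"], $block["timestamp"], $block["previous_hash"], $block["data"],
    ]));
}

// Appends a block that references the hash of the current last block.
function appendBlock(array $chain, string $data, int $timestamp): array {
    $last = end($chain); // false when the chain is still empty
    $block = [
        "index" => $last ? $last["index"] + 1 : 1,
        "timestamp" => $timestamp,
        "previous_hash" => $last ? $last["hash"] : "0",
        "data" => $data,
    ];
    $block["hash"] = linkHash($block);
    $chain[] = $block;
    return $chain;
}

// Returns the index of the first block that fails verification, or null.
function findBrokenBlock(array $chain): ?int {
    $expectedPrevious = "0";
    foreach ($chain as $block) {
        if ($block["previous_hash"] !== $expectedPrevious
            || $block["hash"] !== linkHash($block)) {
            return $block["index"];
        }
        $expectedPrevious = $block["hash"];
    }
    return null;
}

$chain = [];
$chain = appendBlock($chain, "Document 1", 1623562947);
$chain = appendBlock($chain, "Document 2", 1623563145);
$chain = appendBlock($chain, "Document 3", 1623563300);

$intact = findBrokenBlock($chain);          // null: every link matches

$chain[1]["data"] = "Document 2 (edited)";  // tamper with the second block
$broken = findBrokenBlock($chain);          // 2: its stored hash no longer matches
```

If the tampered block's hash were also recomputed, the check would instead fail at block 3, whose previous_hash would no longer match; hiding the change would require rewriting every later block.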
It is also essential to understand how the information in the blockchain is stored. The blockchain runs on a distributed network, conceptually very similar to peer-to-peer (P2P) networks, whose nodes share identical records of the chain. In other words, each node holds an exact copy of the entire chain as it is being generated. An attempt to alter the chain or its blocks on one node is therefore detected when that copy is compared with the others, and the altered copy can be discarded in favor of the version held by the rest of the network. To compromise the information, an attacker would need to alter the blocks concurrently on most of the nodes in the network (in the limit, all of them). The security of the blockchain is thus reinforced not only by the number of static and dynamic nodes in the network, but also by the system’s operational properties, which continuously verify the consistency of hashes across nodes.
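A simplified way to picture this cross-checking is to compare the hash of the last block as reported by each node. The sketch below is an assumption-laden illustration: the node names and shortened hash values are invented, and accepting the value held by a strict majority is one possible rule, not the consensus protocol of any particular blockchain.

```php
<?php
// Hypothetical snapshot: the hash of the last block as reported by each node
// (shortened placeholder values).
$tipHashes = [
    "node-a" => "9f2c",
    "node-b" => "9f2c",
    "node-c" => "51d0", // this copy has been altered
    "node-d" => "9f2c",
];

// Returns the hash held by a strict majority of nodes, or null if none exists.
function majorityHash(array $tipHashes): ?string {
    $counts = array_count_values($tipHashes);
    arsort($counts); // most frequent hash first
    $top = array_key_first($counts);
    return $counts[$top] * 2 > count($tipHashes) ? (string) $top : null;
}

// Lists the nodes whose copy disagrees with the accepted hash.
function divergentNodes(array $tipHashes, string $accepted): array {
    return array_keys(array_filter($tipHashes, fn($h) => $h !== $accepted));
}

$accepted = majorityHash($tipHashes);
$suspect = $accepted === null ? [] : divergentNodes($tipHashes, $accepted);

echo "Accepted hash: " . ($accepted ?? "none") . PHP_EOL;
echo "Nodes to resynchronize: " . implode(", ", $suspect) . PHP_EOL;
```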
To understand blockchain technology, one must also address mining software. This is a computer program that repeatedly executes a computationally expensive calculation until it finds a valid hash for the block to be added to the chain; in proof-of-work systems, this typically means trying successive values of an auxiliary number (a nonce) until the block’s hash meets a target condition. The difficulty of this task is adjusted according to the computing power contributed by the nodes in the network, so that blocks are published at a consistent rate (typically every few minutes). The first miner (a node running the mining software) to compute a valid hash usually receives a reward in the form of cryptocurrency; because new currency is issued only through these rewards, the introduction of liquidity into the system is limited and, in theory, decoupled from speculation in cryptocurrency issuance. Once the hash and the responsible miner are recognized and validated, the information propagates across the entire network of nodes, consolidating the new block. Blockchains are thus closely linked to cryptocurrencies: their design implies a computational cost for recording data, transactions, or documents on the network, which calls for incentives to sustain the node network and enables the creation of a self-managed economic ecosystem.
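The search for a valid hash can be sketched in PHP under the assumption of a simple proof-of-work rule: a counter (a nonce) is appended to the block data and incremented until the SHA-256 hash begins with a given number of zero characters. The function name mineBlock, the separator, and the sample block data are hypothetical; real networks use more elaborate block formats and target conditions.

```php
<?php
// Searches for a nonce such that the SHA-256 hash of the block data plus the
// nonce starts with $difficulty zero characters (in hexadecimal notation).
function mineBlock(string $blockData, int $difficulty): array {
    $prefix = str_repeat("0", $difficulty);
    $nonce = 0;
    do {
        $hash = hash("sha256", $blockData . "|" . $nonce);
        $nonce++;
    } while (strncmp($hash, $prefix, $difficulty) !== 0);
    // $nonce was incremented once after the successful attempt.
    return ["nonce" => $nonce - 1, "hash" => $hash, "attempts" => $nonce];
}

// Hypothetical block data: index, timestamp, previous hash, and content.
$blockData = "2|1623563145|c3aba7f9b959d1a3a3a48c3af24ad8f6|Test transaction 2";
$result = mineBlock($blockData, 3);

echo "Nonce {$result['nonce']} found after {$result['attempts']} attempts" . PHP_EOL;
echo "Hash: {$result['hash']}" . PHP_EOL;
```

Each additional leading zero multiplies the expected number of attempts by 16, which is how the difficulty can be raised or lowered to keep the publication rate stable. Verifying the result, by contrast, takes a single hash computation.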
Limitations
As with any complex system, there are inherent limitations, such as scalability, operational costs, transaction processing speed, regulatory challenges when blockchain is used for cryptocurrencies, and interoperability with other currency systems.
- Scalability. One of the problems with blockchains is their scalability: as the network of nodes and the volume of data and transactions to be processed expand, the resources required to compute hashes and keep blocks constantly verified grow accordingly.
- Cost. Mining on a blockchain and data storage can be expensive due to the need for energy and computational resources, which is not always efficient or profitable at a small scale.
- Speed. Transactions on a blockchain can be slow because the nodes in the network must verify hashes. Speed can, however, be tuned through the difficulty of the hash computation required of miners: reducing the difficulty of finding a valid hash increases the rate at which blocks are processed, accelerating chain growth and reducing consolidation time, at the cost of lower rewards for network miners.
However, it is possible to create a blockchain without needing to link its existence to a cryptocurrency; for this, a stable network of nodes complying with the previously described security, identification, and immutability procedures will be required. This could be the most suitable case for the majority of uses that might be required from a library and information science perspective.
File Storage
Recording textual or alphanumeric content together with its hash is the typical procedure for storing information on the blockchain. However, this method is highly limited when it comes to larger files in office formats, such as PDF files, and all types of multimedia content. In these cases, two options are possible: a) encoding the file as text (e.g., with Base64, which is an encoding and not an encryption scheme), in which case a very long character string is obtained; or b) storing only a hash of the file, computed from its name, timestamps, author, metadata, specific textual markers at designated positions within the document, the permanent storage link, etc. The latter option keeps the stored data very small without compromising the ability to identify the document and verify its integrity. However, the file itself is not contained within the block; it requires a separate storage repository and is referenced by its hash. In this way, if the document is duplicated and stored in a different repository, the hash computed for the copy will not match, because the storage link forms part of the hashed data; any change in the location or integrity of the document is thereby revealed. Documentation referenced from the blockchain can thus be securely linked to servers or networks distinct from the blockchain itself, and network nodes can verify the integrity of a file by comparing the hash stored in the blocks with the actual file and its location.
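Option b) can be sketched as follows. The function names (buildFileRecord, verifyFileRecord), the record fields, and the author name are assumptions made for the illustration; a temporary file stands in for the PDF or multimedia file kept in the external repository.

```php
<?php
// Builds the record stored in the block: the file itself stays in an external
// repository, and only its digest, location, and metadata go on the chain.
function buildFileRecord(string $path, string $permalink, string $author): array {
    $record = [
        "file_name" => basename($path),
        "file_hash" => hash_file("sha256", $path), // digest of the file contents
        "permalink" => $permalink,
        "author" => $author,
    ];
    // The reference hash covers contents, location, and metadata together.
    $record["reference_hash"] = hash("sha256", implode("|", $record));
    return $record;
}

// Checks a file found at $permalink against the record stored on the chain.
function verifyFileRecord(array $record, string $path, string $permalink): bool {
    $candidate = [
        "file_name" => $record["file_name"],
        "file_hash" => hash_file("sha256", $path),
        "permalink" => $permalink,
        "author" => $record["author"],
    ];
    return hash_equals($record["reference_hash"], hash("sha256", implode("|", $candidate)));
}

// Demonstration with a temporary file standing in for the stored document.
$path = tempnam(sys_get_temp_dir(), "doc");
file_put_contents($path, "Contents of the document to be registered");
$record = buildFileRecord($path, "http://www.domain.org/permalink", "Hypothetical Author");

$sameFile = verifyFileRecord($record, $path, "http://www.domain.org/permalink");
$movedFile = verifyFileRecord($record, $path, "http://www.other.org/copy");
file_put_contents($path, "Contents altered after registration");
$editedFile = verifyFileRecord($record, $path, "http://www.domain.org/permalink");
unlink($path);
```

In this sketch the unchanged file at its registered location verifies, while a copy served from another address or a file edited after registration does not.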
Hash Encoding
Computing a hash is a straightforward task, as most programming languages provide functions that automate the process. For instance, PHP provides the «hash» function, which supports the main hash algorithm families (sha, md, ripemd, tiger, crc, gost, snefru, fnv, haval). Not all of them are equally suitable for a blockchain: crc and fnv are non-cryptographic checksums and md5 is no longer considered collision-resistant, so a cryptographic function such as SHA-256 is the usual choice.
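As a brief illustration of how these algorithms differ, the following sketch compares the length of the digests produced by three of the names accepted by PHP's hash() function. The sample text is arbitrary, and hash_algos() is used only to confirm that each algorithm is available in the local PHP build.

```php
<?php
$text = "This is the first block of the chain";

// Digest length, in hexadecimal characters, for a few algorithm names that
// PHP's hash() function accepts.
$lengths = [];
foreach (["sha256", "md5", "crc32b"] as $algorithm) {
    if (in_array($algorithm, hash_algos(), true)) {
        $lengths[$algorithm] = strlen(hash($algorithm, $text));
    }
}

print_r($lengths);
```

A SHA-256 digest has 64 hexadecimal characters (256 bits), an MD5 digest has 32 (128 bits), and a CRC-32 checksum only 8 (32 bits), far too few possible values to serve as a unique, tamper-evident block identifier.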
The following example shows a PHP function named «createBlockHash()», which takes the variables $index (block number), $timestamp (timestamp), $previousHash (hash of the previous block), $data (which may contain transaction metadata, text, or content to be recorded), and $permalink (permanent link to the content) and generates the block hash with SHA-256. Other relevant or representative data may also be incorporated into the hash, such as the author’s name, email address, words located at specific positions in the text, the author’s public keys, etc.
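<?php
// Computes the SHA-256 hash that identifies a block.
function createBlockHash($index, $timestamp, $previousHash, $data, $permalink) {
    // Join the fields with a separator so that adjacent values cannot run
    // together (without it, index 1 + timestamp 23 and index 12 + timestamp 3
    // would produce the same input string).
    $blockData = implode("|", [$index, $timestamp, $previousHash, $data, $permalink]);
    return hash("sha256", $blockData);
}

// Values for the first block of the chain
$index = 1;
$timestamp = time();
$previousHash = "0"; // the first block has no predecessor
$data = "This is the first block of the chain";
$permalink = "http://www.domain.org/permalink";

$hash = createBlockHash($index, $timestamp, $previousHash, $data, $permalink);
echo "The block hash is: " . $hash;
?>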
After the function definition, values are assigned to the variables and the hash calculation function is executed, yielding the $hash value of the block to be added to the chain.
Example of a Blockchain in XML Format
One way to understand how blocks are linked is through a blockchain in XML format. The following listing shows the basic block tags: <index>, which indicates the block number; <timestamp>, the timestamp of the block's registration; <data>, which contains the transaction data; <document_text>, which contains the text of the document to be registered; <previous_hash>, which is the hash of the previous block; and <hash>, which contains the hash of the current block.
<blockchain>
  <block>
    <index>1</index>
    <timestamp>1623562947</timestamp>
    <data>Test transaction 1</data>
    <document_text>This is the text of document 1</document_text>
    <previous_hash>0</previous_hash>
    <hash>c3aba7f9b959d1a3a3a48c3af24ad8f6</hash>
  </block>
  <block>
    <index>2</index>
    <timestamp>1623563145</timestamp>
    <data>Test transaction 2</data>
    <document_text>This is the text of document 2</document_text>
    <previous_hash>c3aba7f9b959d1a3a3a48c3af24ad8f6</previous_hash>
    <hash>e7f0f3ca3a8a2b1a3c3d9b8a1f8c7d6e</hash>
  </block>
</blockchain>
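In this example, the <previous_hash> of block 2 contains the hash of block 1, which forms the link between the two blocks. Note that the hash values shown are 32 hexadecimal characters long, the length of an MD5 digest; a SHA-256 hash would be 64 characters long.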
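The linkage in an XML chain of this kind can be checked programmatically. The sketch below assumes PHP's SimpleXML extension is available and embeds the two sample blocks directly in the script; it verifies only that each <previous_hash> matches the preceding <hash>, since the listing does not state how its hash values were derived and they therefore cannot be recomputed.

```php
<?php
$xml = <<<XML
<blockchain>
  <block>
    <index>1</index>
    <timestamp>1623562947</timestamp>
    <data>Test transaction 1</data>
    <document_text>This is the text of document 1</document_text>
    <previous_hash>0</previous_hash>
    <hash>c3aba7f9b959d1a3a3a48c3af24ad8f6</hash>
  </block>
  <block>
    <index>2</index>
    <timestamp>1623563145</timestamp>
    <data>Test transaction 2</data>
    <document_text>This is the text of document 2</document_text>
    <previous_hash>c3aba7f9b959d1a3a3a48c3af24ad8f6</previous_hash>
    <hash>e7f0f3ca3a8a2b1a3c3d9b8a1f8c7d6e</hash>
  </block>
</blockchain>
XML;

$chain = simplexml_load_string($xml);

// Walk the blocks in order and check that each <previous_hash> equals the
// <hash> of the block before it ("0" for the first block).
$expectedPrevious = "0";
$linked = true;
foreach ($chain->block as $block) {
    if ((string) $block->previous_hash !== $expectedPrevious) {
        $linked = false;
        break;
    }
    $expectedPrevious = (string) $block->hash;
}

echo $linked ? "All blocks are linked" : "Broken link detected";
```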
PHP Program to Create Blocks
In addition to programs for verifying blocks and confirming transactions on the chain, an essential program is the one that adds new blocks. It can be supported by a database, as suggested in the following example, in which the hash of the last block is queried before the next block is added.
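<?php
// Establish the database connection (placeholder credentials)
$conn = mysqli_connect("host", "username", "password", "database_name");

// Verify the connection
if (!$conn) {
    die("Connection failed: " . mysqli_connect_error());
}

// Retrieve the index and hash of the last block. `index` is a reserved word
// in MySQL, so the column name must be quoted with backticks.
$index = 1;
$previous_hash = "0"; // values used when the chain is still empty
$sql = "SELECT `index`, hash FROM blockchain ORDER BY `index` DESC LIMIT 1";
$result = mysqli_query($conn, $sql);
if ($result && mysqli_num_rows($result) > 0) {
    $row = mysqli_fetch_assoc($result);
    $index = (int) $row["index"] + 1;
    $previous_hash = $row["hash"];
}

// Block data
$timestamp = time();
$data = "Some important data here";
// Fields are joined with a separator so that adjacent values cannot run together
$hash = hash("sha256", implode("|", [$index, $timestamp, $data, $previous_hash]));

// Insert the block with a prepared statement, so that the values are never
// interpolated into the SQL text
$sql = "INSERT INTO blockchain (`index`, timestamp, data, previous_hash, hash)
        VALUES (?, ?, ?, ?, ?)";
$stmt = mysqli_prepare($conn, $sql);
mysqli_stmt_bind_param($stmt, "iisss", $index, $timestamp, $data, $previous_hash, $hash);
if (mysqli_stmt_execute($stmt)) {
    echo "Block added successfully";
} else {
    echo "Error adding block: " . mysqli_stmt_error($stmt);
}
mysqli_stmt_close($stmt);

// Close the database connection
mysqli_close($conn);
?>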
This code should be complemented with a function that verifies the integrity of the blockchain before, during, and after the addition of a new block. The data of the new block must also be propagated across the entire network of nodes, while confirmation is obtained from the nodes that the blockchain continues to maintain the integrity and immutability of its hashes and contents. In other words, the verification effort is far from negligible in this type of system, which aims to provide the highest possible level of security.
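One part of that verification, checking a candidate block against the last block before it is accepted, can be sketched as follows. The function names (candidateHash, validateNewBlock), the set of rules, and the hash recipe (fields joined with a separator) are assumptions made for this illustration; propagation to the other nodes is outside its scope.

```php
<?php
// Hypothetical hash recipe: block fields joined with a separator.
function candidateHash(array $block): string {
    return hash("sha256", implode("|", [
        $block["index"], $block["timestamp"], $block["data"], $block["previous_hash"],
    ]));
}

// Returns a list of problems found; an empty list means the block may be added.
function validateNewBlock(array $last, array $candidate): array {
    $errors = [];
    if ($candidate["index"] !== $last["index"] + 1) {
        $errors[] = "index is not consecutive";
    }
    if ($candidate["previous_hash"] !== $last["hash"]) {
        $errors[] = "previous_hash does not match the last block";
    }
    if ($candidate["timestamp"] < $last["timestamp"]) {
        $errors[] = "timestamp precedes the last block";
    }
    if ($candidate["hash"] !== candidateHash($candidate)) {
        $errors[] = "hash does not match the block contents";
    }
    return $errors;
}

// Last block currently on the chain
$last = ["index" => 1, "timestamp" => 1623562947,
         "data" => "Test transaction 1", "previous_hash" => "0"];
$last["hash"] = candidateHash($last);

// A well-formed candidate block
$good = ["index" => 2, "timestamp" => 1623563145,
         "data" => "Test transaction 2", "previous_hash" => $last["hash"]];
$good["hash"] = candidateHash($good);

// The same block with its data altered after the hash was computed
$bad = $good;
$bad["data"] = "Test transaction 2 (altered)";

$goodErrors = validateNewBlock($last, $good);
$badErrors = validateNewBlock($last, $bad);
```

Returning a list of problems rather than a single boolean makes it easier to report to the submitting node why a block was rejected.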
Blockchain in Documentation
As explained, blockchain technology offers advantages that can be leveraged in documentation to ensure information security, immutability, and protection against any form of alteration, thereby facilitating the implementation of anti-fraud information systems, digital libraries, archival systems, and secure information processing. Some relevant applications could include the following:
a) Information retrieval. By creating a distributed record of information, it is possible to verify that the content available through search engines has not been altered, and to present the different cached versions of a web page, each associated with a distinct hash that verifies its authenticity. One of the key problems of the Web is the ease with which content can be changed, which calls for special measures to guarantee its integrity. A blockchain record also allows registered information sources to be traced easily, making the history of the information and documents registered on the chain transparent. This facilitates the recovery of historical, relevant, or important documents, irrespective of permalink usage.
b) Archival Science. Blockchain can be employed in archival science to enhance the preservation and retrieval of important digital/electronic documents. It can be used to store the metadata of these documents and the key portions that represent facts and evidence, facilitating the identification of documentary units and the detection of valuable information. Moreover, the immutability of blockchain enables secure verification and authentication of archives, which is essential for maintaining the integrity of archival records. Additionally, hash chaining allows for greater traceability of stored documents, facilitating their retrieval by ranges or batches and access to the relevant information of each document.
c) Cloud Documents. In the context of cloud storage, blockchain can help protect stored documents, as they can be encrypted and then verified on secure servers using hashes stored on the chain, so that any alteration is detected. It is also possible to monitor access to cloud documents by recording accesses and connected users on the blockchain, which helps verify that only authorized individuals have accessed the information. Furthermore, the property of immutability allows users to trust the integrity of the stored information, enabling greater transparency and traceability of content. This would endow digital libraries and virtual documentation centers with the characteristics of secure digital repositories.
d) Information and documentation communities. Within information and documentation communities, blockchain can enhance collaboration and interaction among members. A distributed and secure registry of the documents and information shared among community members can be established, facilitating collaboration and decision-making. This would lead to the development of social networks whose content is traceable, secure, and cannot be tampered with after the fact. Because information shared on the blockchain is traceable, the progress of the community’s activity can also be monitored and tracked, thereby enabling altmetric analysis. Finally, members of the social network can trust the integrity of the shared information, increasing confidence and security in collaboration and information exchange.
Conclusions
- Blockchain technology can serve as an effective solution for archival applications, digital libraries, scientific repositories, and even professional collaboration social networks, due to the properties of immutability, security, transparency, and traceability of information. Content is recorded in blocks that cannot be altered or modified once finalized. This is advantageous for preventing fraud or unauthorized post-editing of content.
- Another beneficial property of blockchain is its decentralized management across networks of nodes. All nodes hold copies of the transactions carried out, which provides greater security: any attempt to alter the content is detected by the other nodes in the network through inconsistencies in the hashes, prompting the restoration of the affected copy. Because the integrity of information, datasets, and scientific evidence is guaranteed, scholarly documentation can be stored under appropriate security conditions.
- User access control to applications and to document chains and processes is another key aspect, and one that is amenable to automation, since the authenticity of user credentials can be verified. Combined with cryptocurrencies, the blockchain can also act as a payment system, which could serve as the basis for a credit or reward system for authors and content creators, providing a method of remuneration that safeguards intellectual property and ensures fair compensation.