# Welcome to libarchive!

The libarchive project develops a portable, efficient C library that
can read and write streaming archives in a variety of formats. It
also includes implementations of the common `tar`, `cpio`, and `zcat`
command-line tools that use the libarchive library.

## Questions? Issues?

* http://www.libarchive.org is the home for ongoing
  libarchive development, including documentation,
  and links to the libarchive mailing lists.
* To report an issue, use the issue tracker at
  https://github.com/libarchive/libarchive/issues
* To submit an enhancement to libarchive, please
  submit a pull request via GitHub: https://github.com/libarchive/libarchive/pulls

## Contents of the Distribution

This distribution bundle includes the following major components:

* **libarchive**: a library for reading and writing streaming archives
* **tar**: the 'bsdtar' program is a full-featured 'tar' implementation built on libarchive
* **cpio**: the 'bsdcpio' program is a different interface to essentially the same functionality
* **cat**: the 'bsdcat' program is a simple replacement tool for zcat, bzcat, xzcat, and such
* **examples**: Some small example programs that you may find useful.
* **examples/minitar**: a compact sample demonstrating use of libarchive.
* **contrib**: Various items sent to me by third parties; please contact the authors with any questions.

The top-level directory contains the following information files:

* **NEWS** - highlights of recent changes
* **COPYING** - what you can do with this
* **INSTALL** - installation instructions
* **README** - this file
* **CMakeLists.txt** - input for "cmake" build tool, see INSTALL
* **configure** - configuration script, see INSTALL for details. If your copy of the source lacks a `configure` script, you can try to construct it by running the script in `build/autogen.sh` (or use `cmake`).

The following files in the top-level directory are used by the 'configure' script:
* `Makefile.am`, `aclocal.m4`, `configure.ac` - used to build this distribution, only needed by maintainers
* `Makefile.in`, `config.h.in` - templates used by configure script

## Documentation

In addition to the informational articles and documentation
in the online [libarchive Wiki](https://github.com/libarchive/libarchive/wiki),
the distribution also includes a number of manual pages:

* bsdtar.1 explains the use of the bsdtar program
* bsdcpio.1 explains the use of the bsdcpio program
* bsdcat.1 explains the use of the bsdcat program
* libarchive.3 gives an overview of the library as a whole
* archive_read.3, archive_write.3, archive_write_disk.3, and
  archive_read_disk.3 provide detailed calling sequences for the read
  and write APIs
* archive_entry.3 details the "struct archive_entry" utility class
* archive_internals.3 provides some insight into libarchive's
  internal structure and operation.
* libarchive-formats.5 documents the file formats supported by the library
* cpio.5, mtree.5, and tar.5 provide detailed information about these
  popular archive formats, including hard-to-find details about
  modern cpio and tar variants.

The manual pages above are provided in the 'doc' directory in
a number of different formats.

You should also read the copious comments in `archive.h` and the
source code for the sample programs for more details. Please let us
know about any errors or omissions you find.

## Supported Formats

Currently, the library automatically detects and reads the following formats:
* Old V7 tar archives
* POSIX ustar
* GNU tar format (including GNU long filenames, long link names, and sparse files)
* Solaris 9 extended tar format (including ACLs)
* POSIX pax interchange format
* POSIX octet-oriented cpio
* SVR4 ASCII cpio
* Binary cpio (big-endian or little-endian)
* ISO9660 CD-ROM images (with optional Rock Ridge or Joliet extensions)
* ZIP archives (with uncompressed or "deflate" compressed entries, including support for encrypted Zip archives)
* GNU and BSD 'ar' archives
* 'mtree' format
* 7-Zip archives
* Microsoft CAB format
* LHA and LZH archives
* RAR archives (with some limitations due to RAR's proprietary status)
* XAR archives

The library also detects and handles any of the following before evaluating the archive:
* uuencoded files
* files with RPM wrapper
* gzip compression
* bzip2 compression
* compress/LZW compression
* lzma, lzip, and xz compression
* lz4 compression
* lzop compression

The library can create archives in any of the following formats:
* POSIX ustar
* POSIX pax interchange format
* "restricted" pax format, which will create ustar archives except for
  entries that require pax extensions (for long filenames, ACLs, etc.).
* Old GNU tar format
* Old V7 tar format
* POSIX octet-oriented cpio
* SVR4 "newc" cpio
* shar archives
* ZIP archives (with uncompressed or "deflate" compressed entries)
* GNU and BSD 'ar' archives
* 'mtree' format
* ISO9660 format
* 7-Zip archives
* XAR archives

When creating archives, the result can be filtered with any of the following:
* uuencode
* gzip compression
* bzip2 compression
* compress/LZW compression
* lzma, lzip, and xz compression
* lz4 compression
* lzop compression

## Notes about the Library Design

The following notes address many of the most common
questions we are asked about libarchive:

* This is a heavily stream-oriented system. That means that
  it is optimized to read or write the archive in a single
  pass from beginning to end. For example, this allows
  libarchive to process archives too large to store on disk
  by processing them on-the-fly as they are read from or
  written to a network or tape drive. This also makes
  libarchive useful for tools that need to produce
  archives on-the-fly (such as webservers that provide
  archived contents of a user's account).

* In-place modification and random access to the contents
  of an archive are not directly supported. For some formats,
  this is not an issue: For example, tar.gz archives are not
  designed for random access. In some other cases, libarchive
  can re-open an archive and scan it from the beginning quickly
  enough to provide the needed abilities even without true
  random access. Of course, some applications do require true
  random access; those applications should consider alternatives
  to libarchive.

* The library is designed to be extended with new compression and
  archive formats. The only requirement is that the format be
  readable or writable as a stream and that each archive entry be
  independent. There are articles on the libarchive Wiki explaining
  how to extend libarchive.

* On read, compression and format are always detected automatically.

* The same API is used for all formats; in particular, it's very
  easy for software using libarchive to transparently handle
  any of libarchive's archiving formats.

* Libarchive's automatic support for decompression can be used
  without archiving by explicitly selecting the "raw" and "empty"
  formats.

* I've attempted to minimize static link pollution. If you don't
  explicitly invoke a particular feature (such as support for a
  particular compression or format), it won't get pulled in to
  statically-linked programs. In particular, if you don't explicitly
  enable a particular compression or decompression support, you won't
  need to link against the corresponding compression or decompression
  libraries. This also reduces the size of statically-linked
  binaries in environments where that matters.

* The library is generally _thread safe_ depending on the platform:
  it does not define any global variables of its own. However, some
  platforms do not provide fully thread-safe versions of key C library
  functions. On those platforms, libarchive will use the non-thread-safe
  functions. Patches to improve this are of great interest to us.

* In particular, libarchive's modules to read or write a directory
  tree do use `chdir()` to optimize the directory traversals. This
  can cause problems for programs that expect to do disk access from
  multiple threads. Of course, those modules are completely
  optional and you can use the rest of libarchive without them.

* The library is _not_ thread aware, however. It does no locking
  or thread management of any kind. If you create a libarchive
  object and need to access it from multiple threads, you will
  need to provide your own locking.

* On read, the library accepts whatever blocks you hand it.
  Your read callback is free to pass the library a byte at a time
  or mmap the entire archive and give it to the library at once.
  On write, the library always produces correctly-blocked output.

* The object-style approach allows you to have multiple archive streams
  open at once. bsdtar uses this in its "@archive" extension.

* The archive itself is read/written using callback functions.
  You can read an archive directly from an in-memory buffer or
  write it to a socket, if you wish. There are some utility
  functions to provide easy-to-use "open file," etc., capabilities.

* The read/write APIs are designed to allow individual entries
  to be read or written to any data source: You can create
  a block of data in memory and add it to a tar archive without
  first writing a temporary file. You can also read an entry from
  an archive and write the data directly to a socket. If you want
  to read/write entries to disk, there are convenience functions to
  make this especially easy.

* Note: The "pax interchange format" is a POSIX standard extended tar
  format that should be used when the older _ustar_ format is not
  appropriate. It has many advantages over other tar formats
  (including the legacy GNU tar format) and is widely supported by
  current tar implementations.