File size is a measure of how much information a file contains or how much storage it consumes. It is typically expressed in bytes, often with a unit prefix.
The bit is the smallest unit of information in computing; its name is a contraction of "binary digit". A bit has two possible states, zero and one. Zero is known as the false (off) state and one as the true (on) state. A sequence of eight bits forms a byte, which is the next larger unit, and file sizes are expressed in units based on it. The larger units after the byte are named kilobyte, megabyte, gigabyte, terabyte and so on. In the metric system the prefix kilo means 1000 (thousand), but in computing it has traditionally been used to mean 1024. Because this causes considerable confusion, the International Electrotechnical Commission (IEC) approved a new IEC International Standard in December 1998 that introduced separate binary prefixes.
Here is a list of commonly used metric units and their corresponding IEC binary units, with the size of each in bytes:
|Metric unit||Bytes||IEC binary unit||Bytes|
|kilobyte (KB)||10^3||kibibyte (KiB)||2^10|
|megabyte (MB)||10^6||mebibyte (MiB)||2^20|
|gigabyte (GB)||10^9||gibibyte (GiB)||2^30|
|terabyte (TB)||10^12||tebibyte (TiB)||2^40|
|petabyte (PB)||10^15||pebibyte (PiB)||2^50|
|exabyte (EB)||10^18||exbibyte (EiB)||2^60|
|zettabyte (ZB)||10^21||zebibyte (ZiB)||2^70|
|yottabyte (YB)||10^24||yobibyte (YiB)||2^80|
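To make the difference between the two prefix families concrete, here is a minimal Python sketch that renders a byte count in either metric or IEC binary units. The names `format_size`, `METRIC_UNITS` and `BINARY_UNITS` are illustrative choices made for this example, not part of any standard library.

```python
# Unit symbols taken from the table above; "B" stands for plain bytes.
METRIC_UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]


def format_size(num_bytes, binary=False):
    """Return a human-readable size string (hypothetical helper).

    binary=False divides by 1000 per step (metric prefixes);
    binary=True divides by 1024 per step (IEC binary prefixes).
    """
    base = 1024 if binary else 1000
    units = BINARY_UNITS if binary else METRIC_UNITS
    value = float(num_bytes)
    index = 0
    # Step up one unit at a time until the value fits or units run out.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {units[index]}"


print(format_size(10**9))               # metric view of one billion bytes
print(format_size(10**9, binary=True))  # the same byte count in binary units
```

The same number of bytes yields a smaller figure in binary units, which may explain why a drive's advertised capacity can look larger than the capacity an operating system reports.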
File size units can use a metric prefix (as in kilobyte and megabyte) or a binary prefix (as in kibibyte and mebibyte). A file usually occupies slightly more disk space than its actual size once it is written to a file system. This is because the file system allocates space in fixed-size units, commonly called clusters or allocation units, each made up of one or more sectors. The sector size depends on the media type and ranges from a few hundred to a few thousand bytes. For example, if the allocation unit of a file system is 4096 bytes and a file of 6780 bytes is stored, the file occupies 8192 bytes on the storage. The overhead looks large here because the file is small; for larger files it is proportionally much smaller. The wasted space is called internal fragmentation, or slack space. Although a smaller allocation unit makes better use of the available disk space, it tends to lower the performance of the file system.
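The arithmetic behind the 6780-byte example can be sketched in a few lines of Python. The function names `allocated_size` and `wasted_space` and the default unit of 4096 bytes are assumptions made for illustration; real file systems may add metadata overhead or store very small files differently.

```python
def allocated_size(file_size, unit_size=4096):
    """Bytes occupied on storage when space is handed out in whole units."""
    if file_size < 0 or unit_size <= 0:
        raise ValueError("file_size must be >= 0 and unit_size must be > 0")
    # Ceiling division: round the number of units up to a whole number.
    units_needed = -(-file_size // unit_size)
    return units_needed * unit_size


def wasted_space(file_size, unit_size=4096):
    """Internal fragmentation: allocated bytes the file does not use."""
    return allocated_size(file_size, unit_size) - file_size


print(allocated_size(6780))  # 8192, matching the example in the text
print(wasted_space(6780))    # 1412 bytes left unused in the second unit
```

Calling `wasted_space` with a smaller `unit_size`, such as 512, shows how a smaller allocation unit reduces the wasted space for the same file.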
A storage device must be formatted before any read or write operations can be performed on it. Formatting designates the file system type (FAT16, FAT32, exFAT, NTFS, etc.) and sets other parameters, such as the sector size, if the type of storage media allows it. A storage device can be formatted with different file systems depending on the needs of the user and the size of the device (some older file systems cannot support large-capacity drives). Formatting creates the structures on the target media that hold information about files and folders, such as the file allocation tables (FAT) in the FAT family of file systems. These structures also identify the file system type, which the operating system needs in order to read from and write to the storage properly. An operating system must support the file system of a device in order to access it, and different operating systems support different file systems.
The maximum file size depends not only on the size of the storage but also on the file system used. The FAT32 file system, for example, limits a file to 4,294,967,295 bytes (one byte less than 4 gibibytes). Below is a table with information on popular file systems, their properties and their file size limits.
|File System||NTFS5||NTFS||exFAT||FAT32||FAT16||FAT12|
|OS||Windows 2000 / XP / 2003 Server / 2008 / Vista / 7||Windows NT / 2000 / XP / 2003 Server / 2008 / Vista / 7||Windows CE 6.0 / Vista / 7 / XP||DOS v7+ / Windows 98 / ME / 2000 / XP / 2003 Server / Vista / 7||DOS / Windows||DOS / Windows|
|Max Volume Size||2^64 clusters - 1 cluster||2^32 clusters - 1 cluster||128 PB||32 GB (2 TB for some OS)||2 GB (4 GB for some OS)||16 MB|
|Max File Size||2^64 bytes (16 exabytes) minus 1 KB||2^44 bytes (16 terabytes) minus 64 KB||16 EB||4 GB - 1 byte||2 GB||16 MB|
|Max Clusters||2^64 - 1 clusters||2^32 - 1 clusters||4294967295||4177918||65520||4080|
|Max File Name Length (characters)||255||255||255||255||8.3 (extended: 255)||254|
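As a small illustration of how such a limit can be used in practice, the following Python sketch checks whether a file of a given size stays within the FAT32 maximum file size quoted earlier (4,294,967,295 bytes, which is 2^32 - 1). The constant and function names are hypothetical, and the check deliberately ignores free space and volume size.

```python
# FAT32 maximum file size as stated in the text: one byte less than 4 GiB.
FAT32_MAX_FILE_SIZE = 2**32 - 1  # 4,294,967,295 bytes


def fits_on_fat32(file_size):
    """Return True if a file of file_size bytes is within the FAT32 limit.

    Only the per-file limit is checked; available space on the volume
    is not considered in this sketch.
    """
    return 0 <= file_size <= FAT32_MAX_FILE_SIZE


print(fits_on_fat32(4_294_967_295))  # True: exactly at the limit
print(fits_on_fat32(4 * 1024**3))    # False: a full 4 GiB is one byte too large
```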