Data Storage and File Compression | O Level Computer Science 2210 & IGCSE Computer Science 0478 | Detailed Free Notes To Score An A Star (A*)

Measurement of Data Storage

Bit
- The basic (smallest) unit of data in computer storage
- It has a value of either 0 or 1.
- The word is a contraction of "binary digit".
Byte
- Smallest addressable unit of memory in a computer
- 1 byte has 8 bits.
Nibble
- Half a byte
- Contains 4 bits
A single byte cannot store much information, so memory size is expressed in the following multiples
- 1 KB or 1 kilobyte is 1000 bytes
- 1 MB or 1 megabyte is 1000000 bytes
- 1 GB or 1 gigabyte is 1 000 000 000 bytes
- 1 TB or 1 terabyte is 1 000 000 000 000 bytes
- 1 PB or 1 petabyte is 1 000 000 000 000 000 bytes
- 1 EB or 1 exabyte is 1 000 000 000 000 000 000 bytes
This decimal system is used only for some storage devices
- It is technically inaccurate as a measure of memory size
- It uses the SI base-10 system, where 1 kilo is 1000.
Memory size is actually measured in terms of powers of 2.
Therefore, the system adopted by the International Electrotechnical Commission (IEC) is based on the binary system
- 1 KiB or 1 kibibyte is 2^10 or 1024 bytes
- 1 MiB or 1 mebibyte is 2^20 or 1048576 bytes
- 1 GiB or 1 gibibyte is 2^30 or 1073741824 bytes
- 1 TiB or 1 tebibyte is 2^40 or 1099511627776 bytes
- 1 PiB or 1 pebibyte is 2^50 or 1125899906842624 bytes
- 1 EiB or 1 exbibyte is 2^60 or 1152921504606846976 bytes
This system is more accurate
- Internal memories such as RAM or ROM are measured using this system.
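
The gap between the two systems grows with each multiple. As a rough illustration, the sketch below converts between the decimal and binary units listed above. The function names `to_bytes` and `convert`, and the 500 GB drive, are made up for this example.

```python
# Decimal (SI) and binary (IEC) multiples of the byte, as listed in the notes.
SI_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9,
            "TB": 10**12, "PB": 10**15, "EB": 10**18}
IEC_UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30,
             "TiB": 2**40, "PiB": 2**50, "EiB": 2**60}


def to_bytes(amount, unit):
    """Convert an amount in a named unit (e.g. "GB" or "GiB") to bytes."""
    factor = SI_UNITS.get(unit) or IEC_UNITS.get(unit)
    if factor is None:
        raise ValueError(f"unknown unit: {unit}")
    return amount * factor


def convert(amount, from_unit, to_unit):
    """Convert an amount from one unit to another via bytes."""
    return to_bytes(amount, from_unit) / to_bytes(1, to_unit)


# Hypothetical example: a drive sold as 500 GB (decimal) expressed in GiB (binary).
print(round(convert(500, "GB", "GiB"), 2))  # 465.66
```

This is why a drive labelled with a decimal size can appear smaller when software reports its capacity in binary units.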

Calculation of File Size

Image File
- Image Resolution (total number of pixels, i.e. width x height) x Color Depth (in bits)
Mono Sound File
- Sample Rate (in Hz) x Sample Resolution (in bits) x Length of Sample (in seconds)
Stereo Sound File
- The result of the mono sound file calculation is multiplied by 2
Each formula gives the file size in bits; divide by 8 to get the size in bytes.
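
The formulas above can be tried out with a short sketch. The function names and the example values (a 1024 x 768 image at 24-bit color, one minute of 44 100 Hz 16-bit stereo sound) are assumptions chosen for illustration, not figures from the notes.

```python
def image_size_bits(width_px, height_px, color_depth_bits):
    """Image size = resolution (width x height in pixels) x color depth."""
    return width_px * height_px * color_depth_bits


def sound_size_bits(sample_rate_hz, sample_resolution_bits, length_s, channels=1):
    """Sound size = sample rate x sample resolution x length (x 2 for stereo)."""
    return sample_rate_hz * sample_resolution_bits * length_s * channels


# Hypothetical image: 1024 x 768 pixels, 24-bit color depth.
image_bytes = image_size_bits(1024, 768, 24) // 8
print(image_bytes)           # 2359296 bytes
print(image_bytes / 2**20)   # 2.25 MiB

# Hypothetical sound clip: 44 100 Hz, 16-bit samples, 60 seconds, stereo.
stereo_bytes = sound_size_bits(44100, 16, 60, channels=2) // 8
print(stereo_bytes)          # 10584000 bytes
```

Dividing by 8 turns bits into bytes, and dividing by 2^20 then gives MiB.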

Data Compression

It is often necessary to reduce (compress) the size of a file in order to
- Save storage space
- Reduce the time taken for streaming
- Reduce the time taken for download, transfer or upload
- Use less bandwidth
  - Bandwidth is the maximum rate of transfer of data across a network, measured in bits per second.
  - It is used up whenever we upload or download something
  - A compressed file has fewer bits, so it uses less bandwidth and transfers faster
- Reduce costs
  - Cloud storage costs are based on the size of the data stored
  - An ISP may charge less when less data is transferred
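
The effect of file size on transfer time can be estimated by dividing the number of bits by the bandwidth. The sketch below assumes an ideal link with no protocol overhead. The function name, the 20 Mbit/s connection and the file sizes are hypothetical.

```python
def transfer_time_seconds(file_size_bytes, bandwidth_bits_per_second):
    """Ideal transfer time: number of bits divided by bits per second."""
    return file_size_bytes * 8 / bandwidth_bits_per_second


BANDWIDTH = 20_000_000  # assumed 20 Mbit/s connection

# Hypothetical 50 MB file, and the same file compressed to 5 MB.
print(transfer_time_seconds(50_000_000, BANDWIDTH))  # 20.0 seconds
print(transfer_time_seconds(5_000_000, BANDWIDTH))   # 2.0 seconds
```

Real transfers take somewhat longer than this estimate, but the proportion holds: a file one tenth of the size needs roughly one tenth of the time.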

Lossy and Lossless File Compression

Lossy File Compression
- The algorithm eliminates unnecessary data from the file.
- The original file cannot be reconstructed once it has been compressed
- Some loss of detail occurs
- The algorithm decides which parts of the file can be discarded.
- Lossy compression may
  - Reduce the resolution, bit depth or color depth of an image
  - Reduce the sampling rate or sample resolution of a sound file
- The final file is smaller than one produced by lossless compression
  - This helps where storage space is limited
  - This lowers data transfer rate requirements
  - Common lossy file compression formats
    - MP3 (MPEG-1 Audio Layer III, often loosely called MPEG-3)
      - MP3 files are used for playing music
      - It is a compression technology that reduces the size of a normal music file by about 90%.
      - MP3 files are never of quite the same quality as a CD or DVD
        - The quality is still satisfactory in most cases
      - The algorithm removes sounds that the human ear cannot hear
        - Sounds outside the range of human hearing
        - Perceptual music shaping: if two sounds are playing at the same time, only the louder one can be heard by our ear, so the softer sound is eliminated
    - MPEG-4 (MP4)
      - Allows storage of multimedia files rather than just sound.
      - It retains acceptable quality of sound and video
      - There is no real discernible loss of quality
      - Videos and music are commonly streamed online in this format
    - JPEG
      - A raw bitmap file is very large
      - Such files are only temporary
      - JPEG is a lossy compression method used for bitmap images
      - A new file is formed and the original file can no longer be reconstructed
      - JPEG compression relies on two key concepts
        - The human eye cannot detect differences in color shades as well as it detects differences in image brightness, i.e. our eyes are more sensitive to brightness variations than to color variations
        - By separating pixel color from brightness, the image can be split into 8 x 8 pixel blocks, from which certain information can be discarded without any real or noticeable deterioration in quality
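
To give a feel for what "separating pixel color from brightness" means, the sketch below converts an RGB pixel into one brightness value (Y) and two color values (Cb, Cr). It uses the conversion commonly associated with JPEG/JFIF files. This is only the first step of JPEG and not the full algorithm, and the function name `rgb_to_ycbcr` is made up for this example.

```python
def clamp(value):
    """Round to the nearest integer and keep the result in the range 0..255."""
    return max(0, min(255, round(value)))


def rgb_to_ycbcr(r, g, b):
    """Split an 8-bit RGB pixel into brightness (Y) and color (Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp(y), clamp(cb), clamp(cr)


# Grey pixels differ only in brightness; their color values stay neutral (128).
print(rgb_to_ycbcr(100, 100, 100))  # (100, 128, 128)
print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128)
```

Once brightness and color are stored separately, the color values can be kept in less detail than the brightness values, which is where the eye is least likely to notice the loss.
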
Lossless File Compression
- All data from the original uncompressed file can be reconstructed
  - Important where any data loss would be disastrous
  - For example, when a very large and complex spreadsheet is being transferred
  - Or when a very large computer application is being transferred
- Lossless file compression does not lose any data from the original file
- One method is run-length encoding (RLE)
  - Reversible file compression
  - Reduces the size of a string of adjacent, identical data items
    - For example, repeated letters or repeated colors
  - A repeating string (a run) is encoded as two values
    - The first value gives the number of identical items in the run
    - The second value is the code of the data item, for example the ASCII code of the keyboard character being repeated.
- RLE is only effective where there are long runs of repeated units/bits.
- Using RLE on text-based data
  - Take the string bbbbdddddaaaaacc.
  - If each character requires one byte, the string requires 16 bytes.
  - Using ASCII codes, it can be encoded as follows.
    - bbbb becomes 04 98 (a count of 4, followed by 98, the ASCII code of b); similarly, ddddd becomes 05 100, aaaaa becomes 05 97 and cc becomes 02 99.
  - The encoded string 04 98 05 100 05 97 02 99 needs 8 bytes if each value takes 1 byte, which is half the original size.
  - If the data has no runs, such as cdcdcdcdcd, plain RLE does not help because every character needs its own count of 1, so a flag is used instead
    - A flag preceding the data indicates that what follows is a count and the repeated unit.
    - If no flag is present, the following value is taken at face value, as a run of 1.
  - For example
    - Take the string aaaaaaaa bbbbbbbbbb c d c d c d eeeeeeee (32 characters; the spaces are only there for readability)
    - Without a flag, it is coded as 08 97 10 98 01 99 01 100 01 99 01 100 01 99 01 100 08 101
    - The original data uses 32 bytes; the compressed version uses 18 bytes
    - On the other hand, using a flag value of 255 reduces the overall number of bytes further.
      - 255 08 97 255 10 98 99 100 99 100 99 100 255 08 101
      - 15 values are used, so only 15 bytes are required.
- RLE can be used with images
  - It can be used for both color and black-and-white images
- In practice, the reduction in file size from lossless compression is often not very large
  - Other data, such as the file header, has to be stored as well.
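
The text examples above can be reproduced with a short sketch of run-length encoding. It is one possible implementation, not a standard one. It assumes ASCII text in which the value 255 never occurs as data (so 255 is free to act as the flag), and that a run is only flagged when it is at least 4 characters long. All function names are made up for this example.

```python
FLAG = 255  # assumed flag value; must never appear as a data value


def runs(text, max_run=255):
    """Yield (character, run length) for each run of identical characters."""
    i = 0
    while i < len(text):
        n = 1
        while i + n < len(text) and text[i + n] == text[i] and n < max_run:
            n += 1
        yield text[i], n
        i += n


def rle_encode(text):
    """Plain RLE: every run becomes a (count, ASCII code) pair."""
    encoded = []
    for char, n in runs(text):
        encoded += [n, ord(char)]
    return encoded


def rle_decode(encoded):
    """Rebuild the text from (count, ASCII code) pairs."""
    pairs = zip(encoded[0::2], encoded[1::2])
    return "".join(chr(code) * count for count, code in pairs)


def rle_encode_flagged(text, min_run=4):
    """Flagged RLE: long runs become FLAG, count, code; the rest is literal."""
    encoded = []
    for char, n in runs(text):
        if n >= min_run:
            encoded += [FLAG, n, ord(char)]
        else:
            encoded += [ord(char)] * n
    return encoded


def rle_decode_flagged(encoded):
    """Rebuild the text, expanding a run wherever the flag appears."""
    out, i = [], 0
    while i < len(encoded):
        if encoded[i] == FLAG:
            out.append(chr(encoded[i + 2]) * encoded[i + 1])
            i += 3
        else:
            out.append(chr(encoded[i]))
            i += 1
    return "".join(out)


print(rle_encode("bbbbdddddaaaaacc"))  # [4, 98, 5, 100, 5, 97, 2, 99]

text = "a" * 8 + "b" * 10 + "cdcdcd" + "e" * 8   # 32 characters
print(len(rle_encode(text)))           # 18
print(len(rle_encode_flagged(text)))   # 15
```

Decoding either output returns the original string exactly, which is what makes the method lossless.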
