FACTOID # 11: Oklahoma has the highest rate of women in State or Federal correctional facilities.
 Home   Encyclopedia   Statistics   States A-Z   Flags   Maps   FAQ   About 
People who viewed "LZW" also viewed:


FACTS & STATISTICS    Advanced view

Search encyclopedia, statistics and forums:



(* = Graphable)



LZW (Lempel-Ziv-Welch) is an implementation of a lossless data compression algorithm created by Abraham Lempel and Jacob Ziv. It was published by Terry Welch in 1984 as an improved version of the LZ78 dictionary coding algorithm developed by Lempel and Ziv. (See also LZMA, LZ77). Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. ... Flowcharts are often used to represent algorithms. ... Abraham Lempel is a computer scientist and one of the fathers of the LZ family of lossless data compression algorithms. ... Jacob Ziv, along with Abraham Lempel, developed the lossless LZ77 compression algorithm. ... 1984 (MCMLXXXIV) is a leap year starting on Sunday of the Gregorian calendar. ... LZ77 and LZ78 are the names for the two lossless data compression algorithms published in papers by Abraham Lempel and Jacob Ziv in 1977 and 1978. ... 1. ... LZMA, short for Lempel-Ziv-Markov chain-Algorithm, is a data compression algorithm in development until 2001 and used in the 7z format of the 7-Zip archiver and by StuffitX. It uses a dictionary compression scheme somewhat similar to LZ77 and features a high compression ratio (generally higher than... LZ77 and LZ78 are the names for the two lossless data compression algorithms published in papers by Abraham Lempel and Jacob Ziv in 1977 and 1978. ...


Description of the algorithm

The key insight of the method is that it is possible to automatically build a dictionary of previously seen strings in the text being compressed. The dictionary does not have to be transmitted with the compressed text, since the decompressor can build it the same way the compressor does, and if coded correctly, will have exactly the same strings that the compressor dictionary had at the same point in the text. ... In computer programming and some branches of mathematics, strings are sequences of various simple objects. ...

The dictionary starts off with 256 entries, one for each possible character (single byte string). Every time a string not already in the dictionary is seen, a longer string consisting of that string appended with the single character following it in the text, is stored in the dictionary.

The output consists of integer indices into the dictionary. These initially are 9 bits each, and as the dictionary grows, can increase to up to 16 bits. A special symbol is reserved for "flush the dictionary" which takes the dictionary back to the original 256 entries, and 9 bit indices. This is useful if compressing a text which has variable characteristics, since a dictionary of early material is not of much use later in the text. The integers consist of the positive natural numbers (1, 2, 3, …), their negatives (−1, −2, −3, ...) and the number zero. ... This article is about the unit of information. ...

This use of variably increasing index sizes is one of Welch's contributions. Another was to specify an efficient data structure to store the dictionary.

Simple example of dictionary-based compression algorithm

In general, dictionary-based compression replaces phrases with tokens. If the number of bits in the token is less than the number of bits in the phrase, compression will occur. Uncompressed text:

 "I am dumb and because I am dumb, I can't even tell you that I am dumb." 

Compressed text:

 "$1 and because $1, I can't even tell you that $1. $1=[I am dumb]" 

This is very far from efficient, but it illustrates the compression through phrase replacement.


The method became moderately widely used in the program "compress" which became a more or less standard utility in Unix systems circa 1986. (It has since disappeared from many for both legal and technical reasons.) Several other popular compression utilities also used the method, or closely related ones. Wikibooks has more about this subject: Guide to UNIX Unix or UNIX is a computer operating system originally developed in the 1960s and 1970s by a group of AT&T Bell Labs employees including Ken Thompson, Dennis Ritchie, and Douglas McIlroy. ...

It became very widely used after it became part of the GIF image format in 1987. It may also (optionally) be used in TIFF files. GIF (Graphics Interchange Format) is a bitmap image format for pictures with up to 256 distinct colours. ... 1987 (MCMLXXXVII) is a common year starting on Thursday of the Gregorian calendar. ... This article is about TIFF, the computer image format. ...

LZW compression provided a better compression ratio, in most applications, than any well-known method available up to that time. It became the first widely used general purpose data compression method on computers. On large English texts, it would typically compresses to about half the original size. Other kinds of data were also quite usefully compressed in many cases. The English language is a West Germanic language that originates in England. ...

Patent issues

Various patents have been issued in the USA and other countries for LZW and similar algorithms. LZ78 was covered by U.S. Patent 4,464,650 by Lempel, Ziv, Cohn, and Eastman, assigned to Sperry Corporation, later Unisys Corporation, filed on August 10, 1981, and presumably now expired. Two US patents were issued for the LZW algorithm: U.S. Patent 4,814,746 by Victor S. Miller and Mark N. Wegman and assigned to IBM, originally filed on June 1, 1983, and U.S. Patent 4,558,302 by Welch, assigned to Sperry Corporation, later Unisys Corporation, filed on June 20, 1983. A patent is a set of exclusive rights granted by a state to a person for a fixed period of time in exchange for the regulated, public disclosure of certain details of a device, method, process or composition of matter (substance) (known as an invention) which is new, inventive and... Sperry Corporation was a major American equipment and electronics company whose existence spanned more than seven decades of the twentieth century. ... Unisys Corporation (NYSE: UIS) is a provider of information technology services and solutions with operations across the world. ... August 10 is the 222nd day of the year (223rd in leap years) in the Gregorian Calendar. ... 1981 (MCMLXXXI) is a common year starting on Thursday of the Gregorian calendar. ... International Business Machines Corporation (IBM, or colloquially, Big Blue) (NYSE: IBM) (incorporated June 15, 1911, in operation since 1888) is headquartered in Armonk, New York, USA. The company manufactures and sells computer hardware, software, and services. ... June 1 is the 152nd day of the year in the Gregorian calendar (153rd in leap years), with 213 days remaining. ... 1983 (MCMLXXXIII) is a common year starting on Saturday of the Gregorian calendar. ... June 20 is the 171st day of the year (172nd in leap years) in the Gregorian Calendar, with 194 days remaining. ...

US Patent 4,558,302 is the one that has caused the most controversy. Unisys at one time granted royalty-free patent licenses to developers of free software and freely available proprietary software; the company terminated those licenses in August 1999. Many legal experts have concluded that the patent does not cover devices that can only uncompress LZW data and cannot compress it; for this reason, the popular gzip program can read .Z files but cannot write them. Free software, as defined by the Free Software Foundation, is software which can be used, copied, studied, modified and redistributed without restriction. ... Freeware is computer software which is: Made available free of charge. ... It has been suggested that closed source be merged into this article or section. ... 1999 (MCMXCIX) was a common year starting on Friday, and was designated the International Year of Older Persons by the United Nations. ... gzip is short for GNU zip, a GNU free software replacement for the Unix compress program. ...

It was reported in Debian Weekly News based on a comp.compression thread, that the Unisys patent in the USA expired on December 20, 2002 - 17 years and 10 days after it was granted. Most other sources claim the patent expired in June 2003, 20 years after it was filed, because 35 USC ยง154(c)(1) specifies that patents subsisting as of six months after the enactment of the Uruguay Round Agreements Act last for the greater of 17 years after grant and 20 years after filing. December 20 is the 354th day of the year (355th in leap years) in the Gregorian calendar. ... 2002 (MMII) was a common year starting on Tuesday of the Gregorian calendar. ... 2003 (MMIII) was a common year starting on Wednesday of the Gregorian calendar. ... See Uruguay Round and General Agreement on Tariffs and Trade Rounds item #8 for more information. ...

According to a statement on Unisys's web site, counterpart patents on LZW in the United Kingdom, France, Germany, Italy, and Japan have expired in June 2004, and the Canadian patent expired on July 7, 2004. 2004 (MMIV) was a leap year starting on Thursday of the Gregorian calendar. ... July 7 is the 188th day of the year (189th in leap years) in the Gregorian Calendar, with 177 days remaining. ... 2004 (MMIV) was a leap year starting on Thursday of the Gregorian calendar. ...

IBM's US patent expired on August 11, 2006. August 11 is the 223rd day of the year (224th in leap years) in the Gregorian Calendar. ... 2006 (MMVI) is a common year starting on Sunday of the Gregorian calendar. ...

Lempel-Ziv-Welch vs. Ziv-Lempel-Welch

Although the LZW acronym obviously refers to the inventors as Lempel, Ziv and Welch, some people claim the intellectual property rights go to Ziv first, so the method must be called the Ziv-Lempel-Welch algorithm, and not the Lempel-Ziv-Welch algorithm. Others who like distinguishing between the algorithm and the code prefer calling the algorithm LZ and the code written by Welch as LZW. Acronyms and initialisms are abbreviations formed from the initial letter or letters of words, such as NATO and XHTML, and are pronounced in a way that is distinct from the full pronunciation of what the letters stand for. ...

External links

  • United States Patent 4,558,302
  • "LZW Data Compression", by Mark Nelson (DDJ Article with source code)
  • Sad day... GIF patent dead at 20 (Short and possibly an oversimplification of the true story, a more detailed history can be found on the GIF page)
  • Bringing back LZW compression by Nathan Willis
  • List of LZW libraries, papers and other resources

  Results from FactBites:
The GIF Controversy: A Software Developer's Perspective (lzw.info) (4892 words)
LZW itself is a refinement of other algorithms published in the years before (Ziv-Lempel and others).
The LZW patent, as well as its international counterparts, and similar patents filed by others, are expected to remain valid for at least 20 years from the original filing date of June 20, 1983, i.e.
While the original article on LZW was published in 1984, the LZW patent issue first surfaced in the press in 1989, when the BTLZ algorithm (a procedure similar to LZW developed and patented by British Telecom) was to be approved for data compression into the V.42bis modem standard.
Application Note about LZW Compression (638 words)
LZW (short for Lempel-Ziv-Welch) is a lossless compression technique that is an optional part of the GIF and TIFF image file formats.
LZW compression is commonly used for 1- through 8-bit palette color images and less often for 24-bit RGB images.
Notice that for the 24-bit image LZW (or run length encoding) is not really compressing but actually increasing the space requirements.
  More results at FactBites »



Share your thoughts, questions and commentary here
Your name
Your comments

Want to know more?
Search encyclopedia, statistics and forums:


Press Releases |  Feeds | Contact
The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms, 1022, m