Comma-separated values
File extension: .csv
MIME type: text/csv
text/comma-separated-values (deprecated)

The comma-separated values (CSV) file format, also known as a comma-separated list or comma-separated variable format, is a file format that stores tabular data. The format dates back to the days of mainframe computing, and for this reason CSV files are common on all computer platforms.


CSV is one implementation of a delimited text file, which uses a comma to separate values. However, CSV differs from other delimiter-separated file formats in using a " (double quote) character around fields that contain reserved characters (such as commas or newlines). Most other delimiter formats either use an escape character such as a backslash, or have no support for reserved characters.
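The difference between the two approaches can be sketched with Python's standard csv module. This is only an illustration: the sample row is borrowed from the examples later in this article, and the backslash variant stands in for "other delimiter formats" in general rather than for any specific one.

```python
import csv
import io

row = ["1997", "Ford", "E350", "Super, luxurious truck"]

# CSV style: the field containing a comma is wrapped in double quotes.
quoted = io.StringIO()
csv.writer(quoted, lineterminator="\n").writerow(row)

# Escape-character style: no quoting; the comma is preceded by a backslash.
escaped = io.StringIO()
csv.writer(
    escaped, quoting=csv.QUOTE_NONE, escapechar="\\", lineterminator="\n"
).writerow(row)

print(quoted.getvalue(), end="")   # 1997,Ford,E350,"Super, luxurious truck"
print(escaped.getvalue(), end="")  # 1997,Ford,E350,Super\, luxurious truck
```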


In computer science terms, this type of format is called a "flat file" because only one table can be stored in a CSV file. Most systems use a series of tables to store their information, which must be "flattened" into a single table, often with information repeated over several rows, to create a delimited text file.
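As a rough sketch of what "flattening" means, the following Python snippet joins two small tables into one and writes the result as CSV. The table contents, the key name make_id, and the helper flatten are all made up for illustration; real systems will differ.

```python
import csv
import io

# Hypothetical relational data: a "makes" table and a "models" table
# that refers to it by key.
makes = {1: "Ford", 2: "Mercury"}
models = [
    {"make_id": 1, "year": 1997, "model": "E350"},
    {"make_id": 1, "year": 2000, "model": "Focus"},
    {"make_id": 2, "year": 2000, "model": "Cougar"},
]

def flatten(makes, models):
    """Join the two tables into flat rows, repeating the make name per row."""
    return [[m["year"], makes[m["make_id"]], m["model"]] for m in models]

buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(["Year", "Make", "Model"])
writer.writerows(flatten(makes, models))
flat_csv = buf.getvalue()
print(flat_csv, end="")
```

Note how the make name "Ford" is stored once in the makes table but appears on two rows of the flattened output.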

A wizard importing a CSV file into MS Access 2007


Specification

While no formal specification for CSV exists, RFC 4180 describes a common format and establishes "text/csv" as the MIME type registered with the IANA.


Many informal documents exist that describe the CSV format. How To: The Comma Separated Value (CSV) File Format provides an overview of the CSV format in the most widely used applications and explains how it can best be used and supported.


The basic rules are as follows:


CSV is a delimited data format that has fields/columns separated by the comma character and records/rows separated by newlines. Fields that contain a special character (comma, newline, or double quote) must be enclosed in double quotes. A line whose single entry is the empty string may also be enclosed in double quotes. If a field's value contains a double quote character, it is escaped by placing another double quote character next to it. The CSV file format does not require a specific character encoding, byte order, or line terminator format.

  • Each record is one line terminated by a line feed (ASCII LF, 0x0A) or a carriage return and line feed pair (ASCII CRLF, 0x0D 0x0A); however, line breaks can be embedded within quoted fields.
  • Fields are separated by commas.
 1997,Ford,E350 
  • Leading and trailing spaces or tabs, adjacent to commas, are trimmed. (RFC 4180 differs on this point and treats spaces as part of the field.)
 1997, Ford , E350
same as
 1997,Ford,E350
  • Fields with embedded commas must be delimited with double-quote characters.
 1997,Ford,E350,"Super, luxurious truck" 
  • Fields with embedded double-quote characters must be delimited with double-quote characters, and each embedded double-quote character must be represented by a pair of double-quote characters.
 1997,Ford,E350,"Super ""luxurious"" truck" 
  • Fields with embedded line breaks must be delimited by double-quote characters.
 1997,Ford,E350,"Go get one now
 they are going fast"
  • Fields with leading or trailing spaces must be delimited by double-quote characters.
 1997,Ford,E350," Super luxurious truck " 
  • Fields may always be delimited by double-quote characters, whether necessary or not.
 "1997",Ford,E350 
  • The first record in a csv file may contain column names in each of the fields.
 Year,Make,Model
 1997,Ford,E350
 2000,Mercury,Cougar
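The rules above can be checked against an actual parser. The sketch below uses Python's standard csv module on text assembled from the examples in the list; the column name Description is an assumption added only so the header matches the four-field records. Python's reader does not trim spaces next to commas by default, so the trimming rule is shown as a separate, explicit step.

```python
import csv
import io

text = (
    "Year,Make,Model,Description\n"
    '1997,Ford,E350,"Super, luxurious truck"\n'
    '1997,Ford,E350,"Super ""luxurious"" truck"\n'
    '1997,Ford,E350,"Go get one now\nthey are going fast"\n'
    '"1997",Ford,E350," Super luxurious truck "\n'
)

# newline="" leaves line breaks untouched so the csv module can handle
# the one embedded inside a quoted field.
rows = list(csv.reader(io.StringIO(text, newline="")))
header, records = rows[0], rows[1:]

for record in records:
    print(record)

# Trimming is done by hand here. Stripping every field like this would also
# strip quoted fields, so it is applied only to a line with no quoted fields.
trimmed = [field.strip() for field in next(csv.reader(["1997, Ford , E350"]))]
print(trimmed)
```

Four records come back even though the input spans six physical lines, because the line break inside the quoted field belongs to that field rather than ending the record.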

Example

 1997 | Ford  | E350                       | ac, abs, moon          | 3000.00
 1999 | Chevy | Venture "Extended Edition" |                        | 4900.00
 1996 | Jeep  | Grand Cherokee             | MUST SELL!             | 4799.00
      |       |                            | air, moon roof, loaded |

The above table of data may be represented in CSV format as follows:

 1997,Ford,E350,"ac, abs, moon",3000.00
 1999,Chevy,"Venture ""Extended Edition""",,4900.00
 1996,Jeep,Grand Cherokee,"MUST SELL!
 air, moon roof, loaded",4799.00

This CSV example illustrates that:

  • fields that contain commas, double-quotes, or line-breaks must be quoted,
  • a quote within a field must be escaped with an additional quote immediately preceding the literal quote,
  • space before and after delimiter commas may be trimmed, and
  • a line break within an element must be preserved.
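A minimal writer that follows these points might look like the Python sketch below. The helper names quote_field and to_csv are hypothetical, and the choice to quote only when necessary is one of several valid options, since fields may always be quoted. The result is parsed back with the standard csv module as a sanity check.

```python
import csv
import io

def quote_field(value: str) -> str:
    """Quote a field only when it needs it, doubling embedded quotes."""
    needs_quotes = (
        any(ch in value for ch in ',"\r\n')
        or value != value.strip(" \t")  # leading or trailing space or tab
    )
    if not needs_quotes:
        return value
    return '"' + value.replace('"', '""') + '"'

def to_csv(table):
    """Join fields with commas and records with line feeds."""
    return "\n".join(",".join(quote_field(f) for f in row) for row in table)

table = [
    ["1997", "Ford", "E350", "ac, abs, moon", "3000.00"],
    ["1999", "Chevy", 'Venture "Extended Edition"', "", "4900.00"],
    ["1996", "Jeep", "Grand Cherokee", "MUST SELL!\nair, moon roof, loaded", "4799.00"],
]

csv_text = to_csv(table)
print(csv_text)

# Round trip: a standard parser should recover the original table.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
```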

Application support

The CSV file format is very simple and is supported by almost all spreadsheets and database management systems. Many programming languages have libraries available that support CSV files, and modern software applications often support CSV import and export because the format is so widely recognized. Many applications in fact allow .csv-named files to use any delimiter character. Excel, for example, uses the list separator of the current locale settings, which is a semicolon instead of a comma for many locales.
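Because a .csv file may use a delimiter other than the comma, a reader often has to be told which one to expect. A small sketch with Python's standard csv module, using a made-up semicolon-delimited sample in which the comma serves as a decimal mark:

```python
import csv
import io

# Hypothetical export that uses a semicolon as the list separator.
semicolon_text = 'Year;Make;Price\n1997;Ford;"3000,00"\n'

rows = list(
    csv.reader(io.StringIO(semicolon_text, newline=""), delimiter=";")
)
print(rows)  # [['Year', 'Make', 'Price'], ['1997', 'Ford', '3000,00']]
```

With the delimiter set to a semicolon, the comma inside the price is ordinary data; it would not need quoting at all, though quoting it is still permitted.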


See also

  • Delimiter-separated values


External links

  • RFC 4180: Common Format and MIME Type for Comma-Separated Values (CSV) Files
  • How To: The Comma Separated Value (CSV) File Format

 
 

The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.