CSV Catalogs

Overview

CSV catalog files are a popular method of providing buyers with product information and pricing. Typically, the supplier exports their product information to a CSV file and transmits it to the buyer, and the buyer imports the catalog into their eProcurement system. CSV catalogs are best suited to smaller catalogs where pricing changes infrequently and no items are configurable.

Disadvantages

  • Many variations of file format - delimiters (comma vs. tab), newline styles, and quoting rules all differ

  • No standard column headings - will vary by eProcurement system

  • Character encoding is not well defined - needs to be detected or manually set prior to import

  • Configurable products are typically not supported

  • Tiered pricing is typically not supported

  • Editing in Excel can cause data loss

    • leading zeros stripped from part numbers

    • long numbers converted to scientific notation

Transmission Methods

Suppliers can send catalogs to their buyers manually or in an automated fashion over the internet.

Manual methods for providing catalogs to buyers include emailing the catalog and uploading it to a portal provided by the buyer's eProcurement system. Both approaches are time-consuming for suppliers and are therefore not recommended.

Automated methods of transmitting catalogs include uploading them to FTP/FTPS/SFTP sites or hosting them at an HTTP/HTTPS URL. Depending on the eProcurement system the buyer is using, one or more of these transmission methods may be available. For file transfer, FTPS or SFTP is recommended over plain FTP so that the file is encrypted while being uploaded and downloaded. For hosted files, HTTPS with HTTP Basic Auth is recommended over plain HTTP to secure the catalog file.
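As a rough illustration of the HTTPS option, the sketch below fetches a catalog protected by HTTP Basic Auth using only the Python standard library. The function names, URL, and credentials are hypothetical placeholders; the real values depend on how the supplier hosts the file.

```python
import base64
import urllib.request


def build_catalog_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request that carries an HTTP Basic Auth header."""
    if not url.lower().startswith("https://"):
        # Basic Auth credentials are only base64-encoded, not encrypted,
        # so this sketch refuses to send them over plain HTTP.
        raise ValueError("catalog URL must use HTTPS")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def download_catalog(url: str, username: str, password: str, timeout: float = 30.0) -> bytes:
    """Fetch the catalog and return its raw bytes; decoding is left to the importer."""
    request = build_catalog_request(url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Hypothetical usage (URL and credentials are placeholders):
# raw = download_catalog("https://supplier.example.com/catalog.csv", "buyer", "secret")
```

Returning raw bytes rather than text keeps the character encoding decision separate from the transfer step.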

Character Encodings

There is no standard character encoding defined for CSV files. Furthermore, CSV files do not include metadata indicating which character encoding is used.

ASCII is commonly used to avoid issues when importing the catalog into the buyer's eProcurement system. Unfortunately, ASCII does not support many characters that are used in product descriptions, such as accented letters and symbols like ® or °. Other common character encodings include ISO-8859-1 and Windows-1252, which extend ASCII with additional characters. These encodings should be avoided when possible because they are difficult to tell apart in practice.

UTF-8 supports all Unicode characters and generally produces smaller files than UTF-16 or UTF-32. UTF-8 is the recommended encoding for CSV catalog files.
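The ambiguity between the legacy encodings can be seen with a single byte. The sketch below, in Python, tries UTF-8 first and falls back to Windows-1252; the fallback choice and the function name decode_catalog are assumptions for illustration, not a rule from any eProcurement system.

```python
def decode_catalog(raw: bytes) -> str:
    """Decode catalog bytes, preferring UTF-8 and falling back to Windows-1252."""
    try:
        # "utf-8-sig" also strips a leading byte order mark, which some tools add
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumed fallback: guess Windows-1252 for files that are not valid UTF-8
        return raw.decode("cp1252", errors="replace")


# The same byte means different things in the two legacy encodings,
# and nothing in the file says which one was intended.
ambiguous = b"\x80"
print(repr(ambiguous.decode("cp1252")))      # '€'
print(repr(ambiguous.decode("iso-8859-1")))  # '\x80' (a control character)
```

A file that is valid UTF-8 is very unlikely to have been written in another encoding by accident, which is why trying UTF-8 first is a reasonable default.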

RFC 4180

RFC 4180 specifies a dialect to use for CSV files. CSV libraries in most programming languages can be configured to follow RFC 4180 when reading and writing CSV files.

General Rules

  1. Use Windows-style newlines (CRLF, \r\n) to delimit records (rows)

  2. Use commas (,) to delimit values within a record (row)

  3. Use double quotes (") to enclose any fields that contain commas, double quotes, or newlines

  4. If double quotes appear within a field, each double quote should be escaped by preceding it with another double quote
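The four rules above map fairly directly onto the options of Python's built-in csv module. The following is a minimal sketch; the rows are made-up sample data, and an in-memory buffer stands in for a real file.

```python
import csv
import io

# Made-up sample rows; part numbers are kept as strings so leading zeros survive
rows = [
    ["part_number", "description", "price"],
    ["0012345", 'Bolt, 1/4" hex', "4.56"],
    ["3456", "First line\nSecond line", "90.87"],
]

# newline="" stops Python from translating the line endings the writer emits
buffer = io.StringIO(newline="")
writer = csv.writer(
    buffer,
    delimiter=",",              # rule 2: commas between values
    quotechar='"',              # rule 3: double quotes enclose special fields
    doublequote=True,           # rule 4: escape " as ""
    quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
    lineterminator="\r\n",      # rule 1: CRLF between records
)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

When writing to disk instead, opening the file with `open(path, "w", newline="", encoding="utf-8")` keeps the same line endings and produces UTF-8 output.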

Example File

part_number,description,price
123456,My Product,4.56
"23456","My Other Product","56.78"
3456,"My product with a
multi-line description",90.87
4567,"My product, double quotes "" in the description","87.65"

Parsed into a table, the file contains:

part_number  description                                     price
123456       My Product                                      4.56
23456        My Other Product                                56.78
3456         My product with a multi-line description        90.87
4567         My product, double quotes " in the description  87.65
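To check how a parser interprets the example file, it can be read back with Python's csv module. This is only a sketch: the file content is embedded as a string for convenience, and converting prices with Decimal is an illustrative choice rather than a requirement.

```python
import csv
import io
from decimal import Decimal

# The example file, with CRLF record delimiters written out explicitly
example = (
    'part_number,description,price\r\n'
    '123456,My Product,4.56\r\n'
    '"23456","My Other Product","56.78"\r\n'
    '3456,"My product with a\r\nmulti-line description",90.87\r\n'
    '4567,"My product, double quotes "" in the description","87.65"\r\n'
)

# newline="" lets the csv module handle newlines inside quoted fields itself
records = list(csv.DictReader(io.StringIO(example, newline="")))

# Every value arrives as a string, so part numbers keep their exact digits;
# prices are converted explicitly, avoiding binary floating-point rounding
prices = [Decimal(record["price"]) for record in records]

for record in records:
    print(record["part_number"], repr(record["description"]), record["price"])
```

Quoted and unquoted fields parse to the same kind of value, so "23456" and 3456 are both plain strings once read.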
