readcsv

Prototype: readcsv(filename, optional_maxbytes)

Return type: data

Description: Parses CSV data from the file filename and returns the result as a data variable. The maxbytes argument is optional; if specified, only the first maxbytes bytes are read from filename.

Although it may seem similar to data_readstringarrayidx() and data_readstringarray(), the readcsv() function is more capable because it follows RFC 4180, especially regarding quoting. Quoted fields that contain the delimiter cannot be handled correctly by simply splitting strings on a regular expression delimiter.
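The quoting problem can be illustrated outside CFEngine. The following is a minimal shell sketch, not part of the readcsv() implementation: it splits a sample record on commas with awk. The sample record and the variable names are assumptions made for illustration.

```shell
# One CSV record with two fields; the first field contains a comma
# inside double quotes. (Sample data, assumed for this sketch.)
line='"Smith, John",42'

# Naive split on the comma delimiter: the comma inside the quotes is
# treated as a field separator, so awk reports one field too many.
naive_count=$(printf '%s\n' "$line" | awk -F',' '{ print NF }')

echo "naive split sees $naive_count fields"
```

The naive split reports 3 fields, whereas a parser that honors RFC 4180 quoting treats "Smith, John" as a single field and sees 2.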

The returned data is in the same format as data_readstringarrayidx(), that is, a data container that holds a JSON array of JSON arrays.

Arguments:

  • filename: string, in the range: "?(/.*)
  • optional_maxbytes: int, in the range: 0,99999999999

Example:

Prepare:

echo -n 1,2,3 > /tmp/csv

Run:

bundle agent main
{
  vars:

      # note that the CSV file has to have ^M (DOS) EOL terminators
      # thus the prep step uses `echo -n` and just one line, so it will work on Unix
      "csv" data => readcsv("/tmp/csv");
      "csv_str" string => format("%S", csv);

  reports:

      "From /tmp/csv, got data $(csv_str)";

}

Output:

R: From /tmp/csv, got data [["1","2","3"]]

Note: RFC 4180 specifies the CRLF sequence as the line terminator for CSV records. A text file created on Unix with standard Unix tools like vi will not, by default, have those line endings.
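A multi-line test file with CRLF terminators can be prepared on Unix with printf. This is a minimal sketch; the path /tmp/csv_crlf and the variable names are hypothetical and chosen only for this example.

```shell
# Hypothetical file path, chosen only for this sketch.
csv_file=/tmp/csv_crlf

# printf interprets \r\n in its format string, so each record is
# written with a CRLF terminator.
printf '1,2,3\r\n4,5,6\r\n' > "$csv_file"

# Count the carriage returns to confirm that both records end in CRLF.
cr_count=$(tr -cd '\r' < "$csv_file" | wc -c | tr -d ' ')

echo "CR characters: $cr_count"
```

A file prepared this way could then be passed to readcsv() in place of the single-line file used in the example above.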

See also: readdata(), data_readstringarrayidx(), data_readstringarray(), parsejson(), storejson(), mergedata(), and data documentation.

History: Was introduced in 3.7.0.