How to Read CSV file in JAVA

By | April 9, 2018

How to Read CSV file in JAVA – In this java tutorial i will show you about parsing CSV data in java code or import data from CSV file in Java. A Comma-Separated Values (CSV) file is just a normal plain-text file, store data in column by column, and split it by a separator (e.g normally it is a comma “,”).

For example data in CSV file can be like this:

"1","US","United States"
"2","MY","Malaysia"
"3","AU","Australia"
or
1,US,United States
2,MY,Malaysia
3,AU,Australia
Note
Read this RFC4180 document for Comma-Separated Values (CSV) format.

In a CSV file, normally there are two issues:

1. The field containing separator, for example, separator is a comma, and the field containing commas:

"Tiket","Traveloka,Tokopedia","Blibli"

2. The double-quotes are used to enclose fields, and the field containing double-quotes. To solve it, a double-quote appearing inside a field must be escaped by preceding it with another double quote (RFC4180)

"Gojek","Grab""Uber","BlueBird"

In this tutorial, we show you three examples to read, parse and print out the values from a CSV file.

  1. Simple solution to parse a simple formatted CSV file.
  2. Advance solution to parse a weird formatted CSV file (field containing separator or double-quotes)
  3. Third party solution, OpenCSV example.

1. Simple Solution Read CSV file in JAVA

If you are sure the CSV files doesn’t contain “separator or double-quotes”, just use the standard split() to parse the CSV file.

1.1 Review a simple CSV file

D:\\country.csv
"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"
"1.0.1.0","1.0.3.255","16777472","16778239","CN","China"
"1.0.4.0","1.0.7.255","16778240","16779263","AU","Australia"
"1.0.8.0","1.0.15.255","16779264","16781311","CN","China"
"1.0.16.0","1.0.31.255","16781312","16785407","JP","Japan"
"1.0.32.0","1.0.63.255","16785408","16793599","CN","China"
"1.0.64.0","1.0.127.255","16793600","16809983","JP","Japan"
"1.0.128.0","1.0.255.255","16809984","16842751","TH","Thailand"

1.2 No magic, just read above text file, and splits it by a comma separator.

CSVReader.csv
package com.cimblo.csv;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class CSVReader {

    public static void main(String[] args) {

        String csvFile = "D:\\country.csv";
        BufferedReader br = null;
        String line = "";
        String cvsSplitBy = ",";

        try {

            br = new BufferedReader(new FileReader(csvFile));
            while ((line = br.readLine()) != null) {

                // use comma as separator
                String[] country = line.split(cvsSplitBy);

                System.out.println("Country [code= " + country[4] + " , name=" + country[5] + "]");
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (br != null) {
                try {
                    br.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}

1.3 For JDK 7 and above, uses try-resources.

CSVReader.csv
package com.cimblo.csv;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class CSVReader {

    public static void main(String[] args) {

        String csvFile = "D:\\country.csv";
        String line = "";
        String cvsSplitBy = ",";

        try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {

            while ((line = br.readLine()) != null) {

                // use comma as separator
                String[] country = line.split(cvsSplitBy);

                System.out.println("Country [code= " + country[4] + " , name=" + country[5] + "]");

            }

        } catch (IOException e) {
            e.printStackTrace();
        }

    }

}

Output

Country [code= "AU" , name="Australia"]
Country [code= "CN" , name="China"]
Country [code= "AU" , name="Australia"]
Country [code= "CN" , name="China"]
Country [code= "JP" , name="Japan"]
Country [code= "CN" , name="China"]
Country [code= "JP" , name="Japan"]
Country [code= "TH" , name="Thailand"]

2. Advance Solution How to Read CSV file in JAVA

This solution will solve the field containing “separator or double-quotes” issue, and also support the custom separator and custom enclosed field. Review the following CSV parsing example and also the JUnit test cases to understand how it works.

Note
Again, if a double-quote appearing inside a field must be escaped by preceding it with another double quote, for example

"aaa","b""bb","ccc"

2.1 Review another CSV file

D:\\country2.csv
10,AU,Australia
11,AU,Aus""tralia
"12","AU","Australia"
"13","AU","Aus""tralia"
"14","AU","Aus,tralia"

2.2 The below example is inspired by this third party OpenCSV library.

CSVUtils.java
package com.cimblo.utils;

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class CSVUtils {

    private static final char DEFAULT_SEPARATOR = ',';
    private static final char DEFAULT_QUOTE = '"';

    public static void main(String[] args) throws Exception {

        String csvFile = "D:\\country2.csv";

        Scanner scanner = new Scanner(new File(csvFile));
        while (scanner.hasNext()) {
            List<String> line = parseLine(scanner.nextLine());
            System.out.println("Country [id= " + line.get(0) + ", code= " + line.get(1) + " , name=" + line.get(2) + "]");
        }
        scanner.close();

    }

    public static List<String> parseLine(String cvsLine) {
        return parseLine(cvsLine, DEFAULT_SEPARATOR, DEFAULT_QUOTE);
    }

    public static List<String> parseLine(String cvsLine, char separators) {
        return parseLine(cvsLine, separators, DEFAULT_QUOTE);
    }

    public static List<String> parseLine(String cvsLine, char separators, char customQuote) {

        List<String> result = new ArrayList<>();

        //if empty, return!
        if (cvsLine == null && cvsLine.isEmpty()) {
            return result;
        }

        if (customQuote == ' ') {
            customQuote = DEFAULT_QUOTE;
        }

        if (separators == ' ') {
            separators = DEFAULT_SEPARATOR;
        }

        StringBuffer curVal = new StringBuffer();
        boolean inQuotes = false;
        boolean startCollectChar = false;
        boolean doubleQuotesInColumn = false;

        char[] chars = cvsLine.toCharArray();

        for (char ch : chars) {

            if (inQuotes) {
                startCollectChar = true;
                if (ch == customQuote) {
                    inQuotes = false;
                    doubleQuotesInColumn = false;
                } else {

                    //Fixed : allow "" in custom quote enclosed
                    if (ch == '\"') {
                        if (!doubleQuotesInColumn) {
                            curVal.append(ch);
                            doubleQuotesInColumn = true;
                        }
                    } else {
                        curVal.append(ch);
                    }

                }
            } else {
                if (ch == customQuote) {

                    inQuotes = true;

                    //Fixed : allow "" in empty quote enclosed
                    if (chars[0] != '"' && customQuote == '\"') {
                        curVal.append('"');
                    }

                    //double quotes in column will hit this!
                    if (startCollectChar) {
                        curVal.append('"');
                    }

                } else if (ch == separators) {

                    result.add(curVal.toString());

                    curVal = new StringBuffer();
                    startCollectChar = false;

                } else if (ch == '\r') {
                    //ignore LF characters
                    continue;
                } else if (ch == '\n') {
                    //the end, break!
                    break;
                } else {
                    curVal.append(ch);
                }
            }
        }
        result.add(curVal.toString());
        return result;
    }
}

Output

Country [id= 10, code= AU , name=Australia]
Country [id= 11, code= AU , name=Aus"tralia]
Country [id= 12, code= AU , name=Australia]
Country [id= 13, code= AU , name=Aus"tralia]
Country [id= 14, code= AU , name=Aus,tralia]

3.3 Review the following unit test, it test the “commas and double quotes” issue.

CSVUtilsTest.java
package com.cimblo.csv;

import com.cimblo.utils.CSVUtils;
import org.hamcrest.core.IsNull;
import org.junit.Test;
import java.util.List;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.core.Is.is;

public class CSVUtilsTest {

    @Test
    public void test_no_quote() {

        String line = "10,AU,Australia";
        List<String> result = CSVUtils.parseLine(line);

        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Australia"));

    }

    @Test
    public void test_no_quote_but_double_quotes_in_column() throws Exception {

        String line = "10,AU,Aus\"\"tralia";

        List<String> result = CSVUtils.parseLine(line);
        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Aus\"tralia"));

    }

    @Test
    public void test_double_quotes() {

        String line = "\"10\",\"AU\",\"Australia\"";
        List<String> result = CSVUtils.parseLine(line);

        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Australia"));

    }

    @Test
    public void test_double_quotes_but_double_quotes_in_column() {

        String line = "\"10\",\"AU\",\"Aus\"\"tralia\"";
        List<String> result = CSVUtils.parseLine(line);

        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Aus\"tralia"));

    }

    @Test
    public void test_double_quotes_but_comma_in_column() {

        String line = "\"10\",\"AU\",\"Aus,tralia\"";
        List<String> result = CSVUtils.parseLine(line);

        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Aus,tralia"));

    }

}

3.4 Review another unit test, it test the custom separator and custom enclosed field.

CSVUtilsTestCustom.java
package com.cimblo.csv;

import com.cimblo.utils.CSVUtils;
import org.hamcrest.core.IsNull;
import org.junit.Test;
import java.util.List;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.core.Is.is;

public class CSVUtilsTestCustom {

    @Test
    public void test_custom_separator() {

        String line = "10|AU|Australia";
        List<String> result = CSVUtils.parseLine(line, '|');

        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Australia"));

    }

    @Test
    public void test_custom_separator_and_quote() {

        String line = "'10'|'AU'|'Australia'";
        List<String> result = CSVUtils.parseLine(line, '|', '\'');

        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Australia"));

    }

    @Test
    public void test_custom_separator_and_quote_but_custom_quote_in_column() {

        String line = "'10'|'AU'|'Aus|tralia'";
        List<String> result = CSVUtils.parseLine(line, '|', '\'');

        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Aus|tralia"));

    }

    @Test
    public void test_custom_separator_and_quote_but_double_quotes_in_column() {

        String line = "'10'|'AU'|'Aus\"\"tralia'";
        List<String> result = CSVUtils.parseLine(line, '|', '\'');

        assertThat(result, IsNull.notNullValue());
        assertThat(result.size(), is(3));
        assertThat(result.get(0), is("10"));
        assertThat(result.get(1), is("AU"));
        assertThat(result.get(2), is("Aus\"tralia"));

    }

}

3. OpenCSV Example How to Read CSV file in JAVA

If you are not comfortable with above simple and advance solution, try using third party CSV library – OpenCSV.

3.1 Maven.

pom.xml
	<dependency>
		<groupId>com.opencsv</groupId>
		<artifactId>opencsv</artifactId>
		<version>3.8</version>
	</dependency>

3.2 Review a CSV file.

D:\\country3.csv
10,AU,Australia
11,AU,Aus""tralia
"12","AU","Australia"
"13","AU","Aus""tralia"
"14","AU","Aus,tralia"

3.2 OpenCSV example to parse above CSV file.

CSVReaderExample.java
package com.cimblo.csv;
import com.opencsv.CSVReader;
import java.io.FileReader;
import java.io.IOException;

public class CSVReaderExample {

    public static void main(String[] args) {

        String csvFile = "D:\\country3.csv";

        CSVReader reader = null;
        try {
            reader = new CSVReader(new FileReader(csvFile));
            String[] line;
            while ((line = reader.readNext()) != null) {
                System.out.println("Country [id= " + line[0] + ", code= " + line[1] + " , name=" + line[2] + "]");
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Output

Country [id= 10, code= AU , name=Australia]
Country [id= 11, code= AU , name=Aus"tralia]
Country [id= 12, code= AU , name=Australia]
Country [id= 13, code= AU , name=Aus"tralia]
Country [id= 14, code= AU , name=Aus,tralia]
That is how to read and parse CSV file in java code. And to understand how the code work, i hope you try the example parsing CSV data in java above. Because if you got some case like import data from CSV file in Java, you can do with this example.
Note
Please refer to this OpenCSV official documentation for more examples.

References

  1. Ostermiller Java Utilities – Comma Separated Values (CSV)
  2. RFC4180 – Format for Comma-Separated Values (CSV)
  3. OpenCSV website
  4. How to export data to CSV file in JAVA

One thought on “How to Read CSV file in JAVA

  1. Pingback: How to Export Data to CSV file in JAVA ~ Cimblo.com

Leave a Reply

Your email address will not be published. Required fields are marked *