Using a Java Named Function and a Lambda to Process a Text File

Here I am experimenting with the Java 8 lambda feature to count how often a particular word appears in each line of a text file. The code below reads the file, counts the occurrences of the word “dNSName” in every line, and reports the percentage of lines that contain it 0 to 9 times.

The code uses: (1) a named function called findNumOfOccurrence; (2) a stream of lines read with Files.lines(); (3) the stream filter and count operations; and (4) a lambda expression to handle each line.


import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.text.MessageFormat;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

public class Test {
    public static void main(String[] args) throws IOException {
        String inputDataFileName = "D:\\My Data\\wcsans.jul17-1.csv";
        String outputFormat =
            "The percentage of lines with {0} dNSNames: {1}% ({2} out of total {3})";

        // Compile the pattern once instead of on every call of the function.
        Pattern pattern = Pattern.compile("dNSName");

        // Named function: counts the occurrences of "dNSName" in one line.
        Function<String, Integer> findNumOfOccurrence = s -> {
            Matcher matcher = pattern.matcher(s);
            int count = 0;
            while (matcher.find()) {
                count++;
            }
            return count;
        };

        // Files.lines() keeps the file open, so close each stream with try-with-resources.
        long total;
        try (Stream<String> lines = Files.lines(Paths.get(inputDataFileName))) {
            total = lines.count();
        }

        for (int i = 0; i < 10; i++) {
            final int tomatch = i; // a lambda can only capture effectively final locals
            long count;
            try (Stream<String> lines = Files.lines(Paths.get(inputDataFileName))) {
                count = lines.filter(s -> findNumOfOccurrence.apply(s) == tomatch).count();
            }

            // Round to two decimal places without a locale-dependent format/parse round trip.
            double percentage = Math.round(count * 10000.0 / total) / 100.0;
            System.out.println(MessageFormat.format(outputFormat, tomatch, percentage, count, total));
        }
    }
}

The output should look like this:

The percentage of lines with 0 dNSNames: 6.21% (284 out of total 4,576)
The percentage of lines with 1 dNSNames: 1.66% (76 out of total 4,576)
The percentage of lines with 2 dNSNames: 89.03% (4,074 out of total 4,576)
The percentage of lines with 3 dNSNames: 0.9% (41 out of total 4,576)
The percentage of lines with 4 dNSNames: 0.39% (18 out of total 4,576)
The percentage of lines with 5 dNSNames: 0.37% (17 out of total 4,576)
The percentage of lines with 6 dNSNames: 0.33% (15 out of total 4,576)
The percentage of lines with 7 dNSNames: 0.22% (10 out of total 4,576)
The percentage of lines with 8 dNSNames: 0.17% (8 out of total 4,576)
The percentage of lines with 9 dNSNames: 0.07% (3 out of total 4,576)
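
The program above reads the file once for the total and then once more for each value from 0 to 9. As a possible variation, the same counts can be collected in a single pass by grouping the lines by their number of occurrences. This is only a sketch: the class name DnsNameHistogram, the histogram method and the sample lines are made up for illustration, and the file path is expected as the first command-line argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DnsNameHistogram {
    private static final Pattern DNS_NAME = Pattern.compile("dNSName");

    // Same counting logic as findNumOfOccurrence in the program above.
    static final Function<String, Integer> COUNT_OCCURRENCES = s -> {
        Matcher matcher = DNS_NAME.matcher(s);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    };

    // Groups the lines by occurrence count in one pass; TreeMap keeps the keys sorted.
    static Map<Integer, Long> histogram(Stream<String> lines) {
        return lines.collect(
            Collectors.groupingBy(COUNT_OCCURRENCES, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) throws IOException {
        Map<Integer, Long> counts;
        if (args.length > 0) {
            // Read the file given on the command line and close it when done.
            try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
                counts = histogram(lines);
            }
        } else {
            // Made-up sample lines (not from the real data file), used when no path is given.
            counts = histogram(Stream.of("dNSName=a, dNSName=b", "no match here", "dNSName=c"));
        }

        long total = counts.values().stream().mapToLong(Long::longValue).sum();
        counts.forEach((n, c) ->
            System.out.printf("%d dNSNames: %d of %d lines%n", n, c, total));
    }
}
```

Unlike the loop from 0 to 9, this version only reports counts that actually occur in the file, and it also reports lines with ten or more occurrences.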
