I need to read data from a text file line by line. Each line contains either a string or an integer. I want to use StreamReader to read line by line from the text file and StreamWriter to write it to a binary file. The "write to binary file" part will be easy. The "read from text file line by line" part is the part I need help with.
It's all built into StreamReader:
using (var sr = new StreamReader(myFile))
{
string line;
while ((line = sr.ReadLine()) != null)
{
// line is the text line
}
}
In C# you can do something like this:
string loc = "idk/where/ever";
using(var sr = new StreamReader(loc))
using(var sw = new StreamWriter(loc+".tmp"))
{
string line;
while((line=sr.ReadLine())!=null)
{
// edit line however you want before writing it
sw.WriteLine(line);
}
}
File.Delete(loc);
File.Move(loc+".tmp",loc);
I have several files with data in them.
For example: file01.csv with x lines in it, file02.csv with y lines in it.
I would like to process and merge them with MapReduce in order to get a file with the x lines beginning with file01 followed by the line content, and the y lines beginning with file02 followed by the line content.
I have two issues here:
I know how to get lines from a file with MapReduce by setting FileInputFormat.setInputPaths(job, new Path(inputFile));
But I don't understand how I can get the lines of each file in a folder.
Once I have those lines in my mapper, how can I access the corresponding file name, so that I can create the data I want?
Thank you for your consideration.
Ambre
You do not need MapReduce in your situation, because you want to preserve the order of lines in the result file. In this case single-threaded processing will be faster.
Just run a Java client with code like this:
FileSystem fs = FileSystem.get(new Configuration()); // get() needs a Configuration; this one uses the defaults
OutputStream os = fs.create(outputPath); // stream for result file
PrintWriter pw = new PrintWriter(new OutputStreamWriter(os));
for (String inputFile : inputs) { // reading input files
InputStream is = fs.open(new Path(inputFile));
BufferedReader br = new BufferedReader(new InputStreamReader(is));
String line;
while ((line = br.readLine()) != null) {
pw.println(line);
}
br.close();
}
pw.close();
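The loop above copies lines unchanged, while the question also asked for each line to start with the name of the file it came from. A minimal sketch of that step follows, assuming local files read through java.nio.file rather than HDFS; the class name PrefixedMerge and the comma separator are hypothetical choices, not part of the original answer.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: merges text files in the given order and prefixes
// every line with the name of the file it came from.
final class PrefixedMerge {

    static void merge(List<Path> inputs, Path output) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (Path input : inputs) {
                String name = input.getFileName().toString();
                try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        // Assumed output format: "<file name>,<original line>"
                        out.write(name + "," + line);
                        out.newLine();
                    }
                }
            }
        }
    }
}
```

With the Hadoop FileSystem API the same loop shape should apply, with fs.open and fs.create supplying the streams as in the answer.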
I am trying to modify a text file using PHP (I can also use C#). The file I am working on consists of strings, for example:
TM_len= --------------------------------------------
EMM_len --------------------------------------------
T_len=45 CTGCCTGAGCTCGTCCCCTGGATGTCCGGGTCTCCCCAGGCGG
NM_=2493 ----------------ATATAAAAAGATCTGTCTGGGGCCGAA
I want to delete those four lines from the file if one of them consists of only "-" with no other characters in it, and then save the file.
Maybe something like this? I wrote it in an easy-to-understand, "not-shortened" way:
$newfiledata = "";
$signature = " ";
$handle = fopen("inputfile.txt", "r"); // open file
if ($handle) {
while (($line = fgets($handle)) !== false) { // read line by line
$pos = strpos($line, $signature); // locate the first space in the line
if ($pos !== false) { // strict check, because position 0 is also a valid match
$lastpart = trim(substr($line, $pos)); // get second part of text (the sequence)
$newstring = str_replace('-', '', $lastpart); // remove all dashes from the sequence
if (strlen($newstring) > 0) $newfiledata .= rtrim($line, "\r\n")."\r\n"; // keep the line only if characters other than dashes remain
}
}
fclose($handle);
}
// write new file
file_put_contents("newfile.txt", $newfiledata);
Thanks for your response, but nothing happened to the file. Please check the link to the file and the other link to the desired output: download the file and required output file
My purpose is to parse text files and store the information in the respective tables.
I have to parse around 100 folders containing more than 8000 files, with a total size of approximately 20 GB.
When I tried to store the whole file contents in a string, an out-of-memory exception was thrown.
That is:
using (StreamReader objStream = new StreamReader(filename))
{
string fileDetails = objStream.ReadToEnd();
}
Hence I tried logic like this:
using (StreamReader objStream = new StreamReader(filename))
{
// Getting total number of lines in a file
int fileLineCount = File.ReadLines(filename).Count();
if (fileLineCount < 90000)
{
fileDetails = objStream.ReadToEnd();
fileDetails = fileDetails.Replace(Environment.NewLine, "\n");
string[] fileInfo = fileDetails.ToString().Split('\n');
//call respective method for parsing and insertion
}
else
{
while ((firstLine = objStream.ReadLine()) != null)
{
lineCount++;
fileDetails = (fileDetails != string.Empty) ? string.Concat(fileDetails, "\n", firstLine)
: string.Concat(firstLine);
if (lineCount == 90000)
{
fileDetails = fileDetails.Replace(Environment.NewLine, "\n");
string[] fileInfo = fileDetails.ToString().Split('\n');
lineCount = 0;
//call respective method for parsing and insertion
}
}
//when content is 90057, to parse 57
if (lineCount < 90000 )
{
string[] fileInfo = fileDetails.ToString().Split('\n');
lineCount = 0;
//call respective method for parsing and insertion
}
}
}
Here 90,000 is the batch size that is safe to process without an out-of-memory exception in my case.
Still, the process takes more than 2 days to complete. I observed this is because of reading line by line.
Is there a better approach to handle this?
Thanks in advance :)
You can use a profiler to find what is hurting your performance. In this case it's obvious: disk access and string concatenation.
Do not read a file more than once. Let's take a look at your code. First of all, the line int fileLineCount = File.ReadLines(filename).Count(); means you read the whole file and discard what you've read. That's bad. Throw away your if (fileLineCount < 90000) and keep only else.
It almost doesn't matter if you read line-by-line in consecutive order or the whole file because reading is buffered in any case.
Avoid string concatenation, especially for long strings.
fileDetails = fileDetails.Replace(Environment.NewLine, "\n");
string[] fileInfo = fileDetails.ToString().Split('\n');
It's really bad. You read the file line by line, so why do this replacement/split? File.ReadLines() gives you a lazily enumerated sequence of all lines. Just pass it to your parsing routine.
If you do this properly I expect a significant speedup. It can be optimized further by reading files in a separate thread while processing them in the main thread, but that is another story.
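The advice above (read each file once, never rebuild one large string, hand lines to the parser in bounded batches) is not specific to C#. Here is a minimal sketch of the same idea in Java; the class name BatchLineReader and the handler callback are hypothetical stand-ins for the "parsing and insertion" method mentioned in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: reads a source once and passes its lines to a handler
// in batches of at most batchSize lines, without string concatenation.
final class BatchLineReader {

    // Returns the number of batches handed to the handler.
    static int forEachBatch(Reader source, int batchSize, Consumer<List<String>> handler)
            throws IOException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<String> batch = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                batch.add(line);
                if (batch.size() == batchSize) {
                    handler.accept(batch);
                    batches++;
                    batch = new ArrayList<>(); // start a fresh batch
                }
            }
        }
        if (!batch.isEmpty()) { // remaining lines, e.g. the last 57 of 90,057
            handler.accept(batch);
            batches++;
        }
        return batches;
    }
}
```

Each batch is a plain list of lines, so no newline replacement or Split call is needed before parsing.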
I want to read a Unicode file (UTF-8) and write it back to another file.
The code I used for reading is (as in "Textscreen in Codename One, how to read text file?"):
final String textFile = "/readme.txt";
String text = "";
InputStream in = Display.getInstance().getResourceAsStream(null, textFile);
if (in != null){
try {
text = com.codename1.io.Util.readToString(in);
in.close();
} catch (IOException ex) {
System.out.println(ex);
text = "Read Error";
}
}
I even tried
text = com.codename1.io.Util.readToString(in,"UTF-8");
and
DataInputStream dis = new DataInputStream(in);
text = com.codename1.io.Util.readUTF(dis);
But the Unicode text is not being read correctly.
For writing I am doing:
String content = "Some Unicode String";
OutputStream stream = fs.openOutputStream(path + "/" + fileName);
stream.write(content.getBytes());
stream.close();
and I also tried:
DataOutputStream dos = new DataOutputStream(stream);
dos.writeUTF(content);
I observed that the generated file is ANSI encoded.
Update: Solution
As per @Shai's reply,
Read:
// For text file in package structure
InputStream in = Display.getInstance().getResourceAsStream(null, "/" + textFile);
// For file in file system (use this instead of the line above)
InputStream in = fs.openInputStream(textFile);
if (in != null) {
try {
text = com.codename1.io.Util.readToString(in, "UTF-8"); // Encoding
in.close();
} catch (IOException ex) {
text = "Read Error";
}
}
Write:
OutputStream stream = fs.openOutputStream(textFile);
stream.write(content.getBytes("UTF-8"));
stream.close();
The readToString() method reads with UTF-8 encoding. If you encoded the file in one of the ASCII/ANSI encodings, you need to either convert it to UTF-8 or pass the specific encoding to that method.
readUTF from DataInputStream is something completely different, designed for encoded streams and not for text files. DataInputStream in general is not designed for text files in Java; you should be using Reader/InputStreamReader for that sort of thing.
getBytes() uses the platform-specific encoding, which is rarely what you want; you should use getBytes(String).
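The two points above (encode with an explicit charset, decode through an InputStreamReader) can be sketched in plain Java SE. This is an illustration only and does not use the Codename One API; the class name Utf8RoundTrip is hypothetical, and it assumes StandardCharsets is available on the target platform.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing explicit UTF-8 encoding and decoding.
final class Utf8RoundTrip {

    // Encode with an explicit charset instead of the platform default.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode a byte stream as UTF-8 text through a Reader.
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }
}
```

Writing the result of encode(content) to the output stream and reading it back with readAll should return the original string regardless of the platform default encoding.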
How to read data of each cell in Selenium WebDriver using Java?
I have tried with the following code:
CSVReader reader = new CSVReader(new FileReader("D:/data1.csv"));
expectedLabels = reader.readNext();
FieldNames = reader.readNext();
FieldValues = reader.readNext();
File file = new File("D:/data.csv");
if(file.exists()){
System.out.println("File Exists");
}
BufferedReader bufRdr;
bufRdr = new BufferedReader(new FileReader(file));
String line = null;
while((line = bufRdr.readLine()) != null){
StringTokenizer st = new StringTokenizer(line,",");
col=0;
while (st.hasMoreTokens()){
// store each CSV value in the numbers array
numbers[row][col] = st.nextToken();
col++;
}
System.out.println();
row++;
}
bufRdr.close();
Reading CSV files becomes very easy with the OpenCSV jar file. I have used this jar a couple of times to read files.
It provides a predefined class called CSVReader; create an object and pass it a reader for the CSV file.
Call the readAll() method, which returns the CSV content as a List.
Using an Iterator, you can iterate over all the values and use them as your application requires.
I have written an article on this; have a look, it might help you.
http://learn-automation.com/how-to-read-csv-files-using-java/
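For simple files without quoted fields or embedded commas, the same "read all rows into a list, then iterate" shape can be sketched in plain Java without the extra jar. This is not OpenCSV's API; the class name SimpleCsv is hypothetical, and unlike StringTokenizer, split with a negative limit keeps empty cells.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads comma-separated rows into a list of cell arrays.
// It does not handle quoted fields or commas inside values.
final class SimpleCsv {

    static List<String[]> readAll(Reader source) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // The -1 limit keeps empty and trailing cells in the row.
                rows.add(line.split(",", -1));
            }
        }
        return rows;
    }
}
```

A cell is then addressed as rows.get(rowIndex)[columnIndex]; for files with quoting, a real CSV parser is the safer choice.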
I am not able to pass a String variable to the FileReader() constructor; it shows an error when I pass the FileReader to the BufferedReader constructor.
The code is shown below:
String f1= (System.getProperty("User.dir") + "\\Module9TestNG\\src\\TestLogin.xlsx");
BufferedReader bufRdr;
bufRdr = new BufferedReader(new FileReader(file));
String record;
String url= null;
while ((record = bufRdr.readLine()) != null)
{
String fields[] = record.split(",");
url= fields[0].toString();
}
private String Fpath ="D:\\CkycApps.csv";
String line;
File file = new File(Fpath);
BufferedReader bufRdr;
bufRdr = new BufferedReader(new FileReader(file));
while((line = bufRdr.readLine()) != null){
System.out.println(line);
String[] cell= line.split(",");
String FirstName=cell[0];
String MiddleName=cell[1];
}