Bulk insert fixed length file with ragged right lines - sql-server

I have a fixed length text file, except some of the lines end early, with a carriage return/line feed. I'm using a .fmt file.
Q: How do I tell SQL Server to use an empty string for the fields that are unaccounted for?
I should probably ask my client to pad his text file, but it would be easier to just process it with the lines that are terminated early.

You should write a pre-processor to condition the text file before doing the bulk insert.
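A minimal sketch of such a pre-processor, in Python. The names `pad_lines` and `pad_file` are hypothetical, and `width` is assumed to be the full record length implied by the .fmt file. Note that the padded fields will contain spaces rather than being truly empty, so trimming may still be wanted after the load.

```python
def pad_lines(lines, width, fill=" "):
    """Yield each line, without its line ending, padded on the right to `width`."""
    for line in lines:
        body = line.rstrip("\r\n")
        # ljust never truncates, so lines already at full width pass through unchanged.
        yield body.ljust(width, fill)


def pad_file(src_path, dst_path, width, encoding="ascii"):
    """Copy src_path to dst_path with every line padded to `width` and ended by CRLF."""
    # newline="" stops Python from translating line endings on read or write.
    with open(src_path, newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        for padded in pad_lines(src, width):
            dst.write(padded + "\r\n")
```

The bulk insert would then be pointed at the padded copy instead of the client's original file.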

Related

"Too few data elements" error in Knime CSV Reader

I receive the error below when reading a CSV file that contains around 400k rows.
Error:
ERROR CSV Reader 2:1 Execute failed: Too few data elements (line: 2 (Row0), source: 'file:/Users/shobha.dhingra/Desktop/SBC:Non%20SBC/SBC.csv')
I tried reading another CSV file with only a few lines and did not hit the issue.
The problem is not the number of lines but the content of one line (line 2 in your case). Your SBC.csv file appears to be malformed: either it has extra header content, or the second line is missing the commas that stand for the empty cells.
You can use the CSV Reader node's Support Short Lines option to let KNIME handle this case by producing missing cells.
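To see which lines are short before deciding how to handle them, a small check outside KNIME can help. This is a sketch in Python, assuming the first line is the header and defines the expected number of columns; `find_short_rows` is a hypothetical helper name.

```python
import csv
import io


def find_short_rows(text, delimiter=","):
    """Return (line_number, field_count) for every row with fewer fields than the header."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        return []
    expected = len(header)
    short = []
    for row in reader:
        if len(row) < expected:
            # line_num is the number of physical lines consumed so far.
            short.append((reader.line_num, len(row)))
    return short
```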
I get this error when end-of-line characters exist inside a field. You could load the file into a text editor and look for non-printing characters (tabs, carriage returns, etc.) between your delimiters.
If you can't get a clean version of the file, consider using the regex [^ -~] to identify any character that is not a space or a visible ASCII character.
I hope this helps.
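The regex can also be run over the file from a script. The sketch below is in Python; `find_non_printable` is a hypothetical helper, and it strips the trailing line terminator first so that ordinary line endings are not reported.

```python
import re

# Anything outside the range from space (0x20) to tilde (0x7E).
NON_PRINTABLE = re.compile(r"[^ -~]")


def find_non_printable(lines):
    """Return (line_number, column, repr_of_char) for each suspicious character."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        for match in NON_PRINTABLE.finditer(line.rstrip("\r\n")):
            hits.append((line_no, match.start() + 1, repr(match.group())))
    return hits
```

Passing an open file object as `lines` scans the whole file one line at a time.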

Read matrices from multiple .csv files and print matrices in .csv files

So I have to write a C program to read data from .csv files supplied to me by multiple users into matrices, on which I will perform some operations (like matrix addition, multiplication with the necessary conditions on dimensions, etc.), and then print these matrices (or the output data) into .csv files again.
I also need to dynamically allocate memory to my matrices.
Now, I have zero background in dealing with .csv files. I do not at all know the required code to read a .csv file or write into a .csv file. I have searched for long on the Internet but surprisingly I have not found any program that teaches how to deal with .csv files from the elementary level.
I am lost on this and need a lot of guidance, maybe a sample, fully well-written C program as I need a comprehensive example to begin with.
A CSV file is just a plain ASCII text file that contains a grid of values. Think of the file as a set of rows in a database table where each line in the file represents one record and the order of the data in each line is identical. Each item of data is separated using a comma character (hence the name). So to read the file:-
open file
until the end of the file
    read line into a string
    split the string into substrings where ',' is the delimiter
    parse each substring
A CSV file carries no formatting information, so if a value is a string, what do you do when the value itself contains a comma? When you are only reading numbers, that is not a problem.
You could read the file in several passes, the first to determine the amount of data there is (number of columns, number of rows, etc) and the second to actually read the data.
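The reading steps above can be sketched as follows. The question asks for C; Python is used here only to keep the steps visible, and the same loop structure carries over. The sketch assumes purely numeric values with no quoted fields, and `parse_matrix` and `read_matrix` are hypothetical names.

```python
def parse_matrix(lines, delimiter=","):
    """Turn an iterable of CSV text lines into a list of rows of floats."""
    matrix = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the delimiter, then parse each substring as a number.
        row = [float(cell) for cell in line.split(delimiter)]
        if matrix and len(row) != len(matrix[0]):
            raise ValueError(
                f"line {line_no}: expected {len(matrix[0])} values, got {len(row)}"
            )
        matrix.append(row)
    return matrix


def read_matrix(path, delimiter=","):
    """Open the file and read it to the end, one line at a time."""
    with open(path) as f:
        return parse_matrix(f, delimiter)
```

Because the rows grow as they are read, a single pass is enough here; in C, the two-pass idea is one way to learn the dimensions before allocating.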
Writing the CSV is quite simple:-
open file
for each record to write
    for each element to write
        write element
        if not last element
            write a comma
    write a new line
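The writing steps translate almost line for line. This is a sketch in Python rather than C, assuming numeric elements that need no quoting; `write_matrix` is a hypothetical name and `stream` is any open text file.

```python
def write_matrix(stream, matrix, delimiter=","):
    """Write each record as one line, with a delimiter between elements."""
    for record in matrix:
        last = len(record) - 1
        for i, element in enumerate(record):
            stream.write(str(element))
            if i != last:
                stream.write(delimiter)  # no delimiter after the last element
        stream.write("\n")
```

For example, `with open("out.csv", "w") as f: write_matrix(f, result)` would write a matrix named `result` to a file.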

How to read from a specific line from a text file in VHDL

I am doing a program in VHDL to read and write data. My program has to read data from a line, process it, and then save the new value in the old position. My code is somewhat like:
WRITE_FILE: process (CLK)
    variable VEC_LINE : line;
    file VEC_FILE : text is out "results";
begin
    if CLK='0' then
        write (VEC_LINE, OUT_DATA);
        writeline (VEC_FILE, VEC_LINE);
    end if;
end process WRITE_FILE;
If I want to read line 15, how can I specify that? I then want to clear line 15 and write new data there. LINE is an access type; will it accept integer values?
Russell's answer - using two files - is the answer.
There isn't a good way to find the 15th line (seek) but for VHDL's purpose, reading and discarding the first 14 lines is perfectly adequate. Just wrap it in a procedure named "seek" and carry on!
If you're on the 17th line already, you can't seek backwards or rewind to the beginning. What you can do is flush the output file: save the open line, copy the rest of the input file to it, close both files and reopen them. (Naturally, this requires VHDL-93 rather than VHDL-87 syntax for file operations.) Just wrap that in a procedure called "rewind", and carry on!
Keep track of the current line number, and now you can seek to line 15, wherever you are.
It's not pretty and it's not fast, but it'll work just fine. And that's good enough for VHDL's purposes.
In other words you can write a text editor in VHDL if you must, (ignoring the problem of interactive input, though reading stdin should work) but there are much better languages for the job. One of them even looks a lot like an object-oriented VHDL...
Use 2 files, an input file and an output file.
file_open(vectors, "stimulus/input_vectors.txt", read_mode);
file_open(results, "stimulus/output_results.txt", write_mode);
while not endfile(vectors) loop
    readline(vectors, iline);
    read(iline, a_in);
    -- etc. for all your input data...
    write(oline, <output data>);
    -- write() only fills the line buffer; writeline() sends it to the file
    writeline(results, oline);
end loop;
file_close(vectors);
file_close(results);
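The two-file idea is not specific to VHDL: read the input in order, copy every line to the output, and substitute new text when the target line number comes up. Below is a language-neutral sketch of that logic in Python, not VHDL; `replace_line` is a hypothetical name.

```python
def replace_line(src, dst, target, new_line):
    """Copy src to dst line by line, writing new_line in place of line `target`."""
    for line_no, line in enumerate(src, start=1):
        # Lines before and after the target are copied through unchanged.
        dst.write(new_line if line_no == target else line)
```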

How do I insert data at the top of a CSV file?

How can I go back to the very beginning of a csv file and add rows?
(I'm printing to a CSV file from C using fprintf(). At the end of printing thousands of rows (5 columns) of data, I would like to go back to the top of the file and insert some dynamic header data (based on how things went printing everything). )
Thank You.
Due to the way files are structured, this is more or less impossible. In order to accomplish what you want:
write csv data to file1
write header to file2
copy contents of file1 to file2
delete file1
Or you can hold the csv data in ram and write it to file after you're finished processing and know the header.
Another option is to set aside a certain number of bytes for the header, which will work much faster for large files at minimal space cost. Since the space is allocated in the file at the start of the write, there aren't any issues going back and filling it in. Reopen the file as random access ("r+"), which points to the top of the file by default, write header, and close.
The simplest way would be to simply store the entire contents of the file in memory until you are finished, write out the header, and then write out the rest of the file.
If memory is an issue and you can't safely store the entire file in memory, or just don't want to, then you could write out the bulk of the CSV data to a temporary file, then when you are finished, write the header out to the primary file, and copy the data from the temporary file to the primary file in a loop.
If you wanted to be fancy, you could shift the file in place after writing the main CSV data to the primary file. Loop through the file from the beginning: read into memory the chunk you are about to overwrite with the header, write the header over it, then read the next chunk and overwrite it with the previous one, and so on until you reach the end and append the final chunk. In this way you "insert" data at the beginning by moving the rest of the file down. I really wouldn't recommend this, as it mostly adds complexity without much benefit, unless there is a specific reason you can't do something simpler like using a temporary file.
I think that is not possible. Probably the easiest way would be to write the output to a temporary file, then create the data you need as the dynamic header, write them to the target file and append the previously created temporary file.
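A minimal sketch of the temporary-file approach, in Python rather than C. The name `write_with_header` is hypothetical, `rows` is assumed to be an iterable of already formatted CSV lines, and `make_header` stands in for whatever computes the dynamic header once the row count is known.

```python
import shutil
import tempfile


def write_with_header(dst_path, rows, make_header):
    """Spool rows to a temporary file, then write header + rows to dst_path."""
    count = 0
    with tempfile.TemporaryFile("w+", newline="") as body:
        for row in rows:
            body.write(row)
            count += 1
        body.seek(0)  # rewind the spooled data before copying it
        with open(dst_path, "w", newline="") as dst:
            dst.write(make_header(count))
            shutil.copyfileobj(body, dst)
```

The temporary file is removed automatically when it is closed, which covers the "delete file1" step.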
write enough blank spaces in the first line
write data
seek(0)
write header - last column will be padded with spaces
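The reserved-space steps can be sketched as below, in Python; the "r+" mode behaves like the one in C's fopen. `HEADER_WIDTH` and `write_reserved` are hypothetical names, the header is assumed to be single-byte text without a newline, and the width is an arbitrary guess that must be at least as large as the longest possible header.

```python
HEADER_WIDTH = 64  # assumed upper bound on the header length


def write_reserved(path, rows, make_header, width=HEADER_WIDTH):
    """Write a blank first line, then the data, then fill in the header in place."""
    count = 0
    with open(path, "w", newline="") as f:
        f.write(" " * width + "\n")  # placeholder line of spaces
        for row in rows:
            f.write(row)
            count += 1
    header = make_header(count)
    if len(header) > width:
        raise ValueError("header does not fit in the reserved space")
    # "r+" opens for update without truncating, positioned at the start.
    with open(path, "r+", newline="") as f:
        f.write(header.ljust(width))
```

The header line ends up padded with trailing spaces, so whatever reads the file may need to trim the last column.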

Easy way to inspect BCP .dat file?

I'm getting the BCP error "Unexpected EOF encountered in BCP data-file" during import, which is probably misleading. I strongly suspect that some field has been added to the table or that there is some offending character in the file.
How would I go about inspecting the contents of .dat file visually?
Are there any good hex viewers where I can quickly try to adjust row length to see the data in tabular manner?
Other suggestions are also appreciated.
I guess it depends on your input format. Is it binary input? If so, it's going to be hard. I use Visual Studio to open the file in the binary viewer, but it's far from easy. The usual suspects are CRLFs in a text field, or text that contains your field delimiter or EOL character.
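If no hex viewer is at hand, a few lines of script give a dump whose row length can be adjusted until the records line up. This is a sketch in Python; `hex_dump` is a hypothetical helper and `width` is the number of bytes shown per row.

```python
def hex_dump(data, width=16):
    """Return dump lines: offset, hex bytes, and printable ASCII ('.' otherwise)."""
    pad = width * 3 - 1  # width of a full row of hex pairs separated by spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part.ljust(pad)}  {text_part}")
    return lines
```

Reading the file with `open(path, "rb").read()` and setting `width` to the suspected record length puts one record on each row, which makes a stray CR/LF or delimiter easier to spot.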
