C reading csv file

I'm running into a problem I haven't encountered before, and I'm baffled. When I read a CSV file char by char, spaces seem to get inserted between the characters, and what's weirder is that the file contains no space chars anywhere. Here is an example:
char *readgd(const char *fname)
{
    char *gddata, *tmp;
    FILE *fp;
    size_t buff = 1024, c = 0;
    int ch;

    if(!(fp = fopen(fname, "r")))
    {
        printf("\nError! Could not open %s!", fname);
        return NULL;
    }
    if(!(gddata = malloc(buff)))
    {
        fclose(fp);
        printf("\nError! Memory allocation failed!");
        return NULL;
    }
    /* Test the result of fgetc directly so ch is never read uninitialized
       and EOF is never stored in the buffer. */
    while((ch = fgetc(fp)) != EOF)
    {
        if(c + 1 >= buff)
        {
            buff += buff;
            if(!(tmp = realloc(gddata, buff)))
            {
                free(gddata);
                fclose(fp);
                printf("\nError! Memory allocation failed!");
                return NULL;
            }
            gddata = tmp;
        }
        gddata[c++] = ch;
        if(ch != ' ') printf("%c", ch); //no spaces?
    }
    /* Shrink to fit, keeping one byte for the terminator. */
    if(!(tmp = realloc(gddata, c + 1)))
    {
        free(gddata);
        fclose(fp);
        printf("\nError! Memory allocation failed!");
        return NULL;
    }
    gddata = tmp;
    gddata[c] = 0x00;
    fclose(fp);
    return gddata;
}
with the following CSV snippet:
:Tagname,Area,SecurityGroup,Container,ContainedName,ShortDesc,ExecutionRelativeOrder,ExecutionRelatedObject,UDAs,Extensions,CmdData,Address_ACbHAlmCfg,Address_ACbHWarnCfg,Address_ACbLAlmCfg,Address_ACbLWarnCfg,Address_ACbTfCfg,Address_ACrHAlmDb,Address_ACrHAlmSp,Address_ACrHAlmTmrSp,Address_ACrHWarnDb,Address_ACrHWarnSp,Address_ACrHWarnTmrSp,Address_ACrLAlmDb,Address_ACrLAlmSp,Address_ACrLAlmTmrSp,Address_ACrLWarnDb,Address_ACrLWarnSp,Address_ACrLWarnTmrSp,Address_ACrTfTmrSp,Address_bHalm,Address_bHWarn,Address_bLAlm,Address_bLwarn,Address_bMode,Address_bTfAlm,Address_rCCmd,Address_rVal,
outputs this onto the console:
 
■: T a g n a m e , A r e a , S e c u r i t y G r o u p , C o n t a i n e r , C
o n t a i n e d N a m e , S h o r t D e s c , E x e c u t i o n R e l a t i v e
O r d e r , E x e c u t i o n R e l a t e d O b j e c t , U D A s , E x t e n s
i o n s , C m d D a t a , A d d r e s s _ A C b H A l m C f g , A d d r e s s _
A C b H W a r n C f g , A d d r e s s _ A C b L A l m C f g , A d d r e s s _ A
C b L W a r n C f g , A d d r e s s _ A C b T f C f g , A d d r e s s _ A C r H
A l m D b , A d d r e s s _ A C r H A l m S p , A d d r e s s _ A C r H A l m T
m r S p , A d d r e s s _ A C r H W a r n D b , A d d r e s s _ A C r H W a r n
S p , A d d r e s s _ A C r H W a r n T m r S p , A d d r e s s _ A C r L A l m
D b , A d d r e s s _ A C r L A l m S p , A d d r e s s _ A C r L A l m T m r S
p , A d d r e s s _ A C r L W a r n D b , A d d r e s s _ A C r L W a r n S p ,
A d d r e s s _ A C r L W a r n T m r S p , A d d r e s s _ A C r T f T m r S p
, A d d r e s s _ b H a l m , A d d r e s s _ b H W a r n , A d d r e s s _ b L
A l m , A d d r e s s _ b L w a r n , A d d r e s s _ b M o d e , A d d r e s s
_ b T f A l m , A d d r e s s _ r C C m d , A d d r e s s _ r V a l ,
I am very confused as to where these spaces are coming from. Any help would be greatly appreciated.

Are you sure the CSV is not encoded with UTF-16 (using two bytes per character)?
This is the most likely reason you'd see spaces between otherwise valid ASCII characters, so try verifying the encoding first.
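If the file is UTF-16LE, every ASCII character is followed by a zero byte, and the file usually starts with the byte-order mark FF FE. A console typically shows those zero bytes as blanks, which would explain why the != ' ' test in the question does not filter them out; the ■ at the start of the output is also consistent with a BOM. Here is a minimal sketch for checking this, assuming the file contents are already in a buffer. The names sniff_bom and utf16le_to_ascii are hypothetical helpers, not part of any library, and the conversion only handles ASCII-range characters.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ENC_UNKNOWN, ENC_UTF8_BOM, ENC_UTF16_LE, ENC_UTF16_BE } encoding;

/* Guess the encoding from a byte-order mark at the start of the data.
   A file without a BOM is reported as ENC_UNKNOWN. */
encoding sniff_bom(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return ENC_UTF8_BOM;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return ENC_UTF16_LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return ENC_UTF16_BE;
    return ENC_UNKNOWN;
}

/* Narrow UTF-16LE data to a NUL-terminated char string.
   Code units above 0x7F are replaced by '?' (assumption: the CSV is
   mostly ASCII). out must have room for len / 2 + 1 bytes.
   Returns the number of characters written, not counting the terminator. */
size_t utf16le_to_ascii(const unsigned char *in, size_t len, char *out)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i + 1 < len; i += 2) {
        unsigned unit = in[i] | ((unsigned)in[i + 1] << 8);
        if (unit == 0xFEFF)
            continue;                       /* skip the byte-order mark */
        out[n++] = unit < 0x80 ? (char)unit : '?';
    }
    out[n] = '\0';
    return n;
}
```

To use this with a reader like readgd, open the file with "rb" rather than "r" so the bytes arrive untranslated, and track the byte count separately, because UTF-16 data contains zero bytes that terminate C strings early.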

Related

How to use arrays in sas?

How do I perform a calculation using arrays in SAS?
Source file (scholar):
Anne C A C D B E D D B A
Vicky C C C E B E D B A
Laurel D D C D B E D D B A
Victor C A C D B E D D A D
Dimple C A C D B E D D B A
Godfrey B D C B D D D B B A
Denny C D C B E E D B B A
Richard C A C D B E D D B A

Try this
data have;
input name $ (q1 - q10)(:$1.);
infile datalines missover;
datalines;
Anne C A C D B E D D B A
Vicky C C C E B E D B A
Laurel D D C D B E D D B A
Victor C A C D B E D D A D
Dimple C A C D B E D D B A
Godfrey B D C B D D D B B A
Denny C D C B E E D B B A
Richard C A C D B E D D B A
;
data want;
set have;
array a {10} $ _temporary_ ("C", "A", "C", "D", "B", "E", "D", "D", "B", "A");
array q q1 - q10;
total = 0;
do over q;
if q = a[_I_] then total + 1;
end;
Result = ifc(total ge 7, "Passed", "Failed");
run;
Result:
Obs name q1 ------------q10 total Result
1 Anne C A C D B E D D B A 10 Passed
2 Vicky C C C E B E D B A 5 Failed
3 Laurel D D C D B E D D B A 8 Passed
4 Victor C A C D B E D D A D 8 Passed
5 Dimple C A C D B E D D B A 10 Passed
6 Godfrey B D C B D D D B B A 4 Failed
7 Denny C D C B E E D B B A 6 Failed
8 Richard C A C D B E D D B A 10 Passed

Not sure why, but there you go:
data want;
set have;
array a {10} $ _temporary_ ("C", "A", "C", "D", "B", "E", "D", "D", "B", "A");
array t {10} _temporary_ (10 * 0);
array q q1 - q10;
do over q;
if q = a[_I_] then t[_I_] = 1;
end;
total = sum(of t[*]);
Result = ifc(total ge 7, "Passed", "Failed");
call stdize('replace', 'mult=', 0, of t[*], _N_);
run;

I suspect you want to use simpler constructs like below:
data want;
set have;
array a {10} $ _temporary_ ("C", "A", "C", "D", "B", "E", "D", "D", "B", "A");
array correct_answer(10) correct_answer1-correct_answer10 ;
array q {10} q1 - q10;
do i=1 to dim(a);
if q[i] = a[i] then correct_answer[i] = 1;
else correct_answer[i] = 0;
end;
Total = sum(of correct_answer1-correct_answer10);
if total>= 7 then result="Passed";
else result ="Failed";
run;

Join four columns into one according to each row

A B C D
E F G H
I J K L
M N O P
If I chose to join the columns, I would use ={A1:A;B1:B;C1:C;D1:D}, but the result would look like this:
A
E
I
M
B
F
J
N
... and so on
I would like it to look like this:
A
B
C
D
E
F
G
... and so on
How to proceed in this case?
Note: Some of the columns may be incomplete, and some may have more values than others, but I still want the result to follow the same pattern. Example:
A B D
E G H
I J K L
M N O P
Result:
A
B
D
E
G
H
... and so on

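First, join each row into a single space-separated string:
=TRANSPOSE(QUERY(TRANSPOSE(A:D),, 9^9))
Then join those strings, split the result on spaces, and transpose it into one column:
=TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(A:D),,9^9)),,9^9), " "))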

Remove spaces between characters of output Batch file

I have a batch file that outputs the Wi-Fi adapter MAC address to a text file using the command wmic nic where "name like '%%802%%'" get name,macaddress and there are spaces between characters. I get the same thing for sfc /scannow in a batch file.
The output of the wmic command is:
Hostname: SOME-COMPUTER
A C A d d r e s s N a m e
3 4 : F 6 : B 3 : J 3 : 6 3 : 1 3 B r o a d c o m 8 0 2 . 1 1 n N e t w o r k A d a p t e r
The snippet of output of the sfc scan is:
n 3 0 % c o m p l e t e . V e r i f i c a t i o n 3 1 % c o m p l e t e . V e r i f i c a t i o n 3 1 % c o m p l e t e . V e r i f i c a t i o n 3 2 % c o m p l e t e . V e r i f i c a t i o n 3 2 % c o m p l e t e . V e r i f i c a t i o n 3 3 % c o m p l e t e . V e r i f i c a t i o n 3 3 % c o m p l e t e . V e r i f i c a t i o n 3 4 % c o m p l e t e . V e r i f i c a t i o n 3 4 % c o m p l e t e . V e r i f i c a t i o n 3 5 % c o m p l e t e . V e r i f i c a t i o n 3 5 % c o m p l e t e . V e r i f i c a t i o n 3 5 % c o m p l e t e . V e r i f i c a t i o n 3 6 % c o m p l e t e . V e r i f i c a t i o n 3 6 % c o m p l e t e . V e r i f i c a t i o n 3 7 % c o m p l e t e . V e r i f i c a t i o n 3 7 % c o m p l e t e . V e r i f i c a t i o n 3 8 % c o m p l e t e . V e r i f i c a t i o n 3 8 % c o m p l e t e . V e r i f i c a t i o n 3 9 % c o m p l e t e . V e r i f i c a t i o n 3 9 % c o m p l e t e . V e r i f i c a t i o n 4 0 % c o m p l e t e . V e r i f i c a t i o n 4 0 % c o m p l e t e . V e r i f i c a t i o n 4 1 % c o m p l e t e . V e r i f i c a t i o n 4 1 % c o m p l e t e . V e r i f i c a t i o n 4 2 % c o m p l e t e . V e r i f i c a t i o n 4 2 % c o m p l e t e . V e r i f i c a t i o n 4 2 % c o m p l e t e . V e r i f i c a t i o n 4 3 % c o m p l e t e . V e r i f i c a t i o n 4 3 % c o m p l e t e . V e r i f i c a t i o n 4 4 % c o m p l e t e . V e r i f i c a t i o n 4 4 % c o m p l e t e . V e r i f i c a t i o n 4 5 % c o m p l e t e . V e r i f i c a t i o n 4 5 % c o m p l e t e . V e r i f i c a t i o n 4 6 % c o m p l e t e . V e r i f i c a t i o n 4 6 % c o m p l e t e . V e r i f i c a t i o n 4 7 % c o m p l e t e . V e r i f i c a t i o n 4 7 % c o m p l e t e . V e r i f i c a t i o n 4 8 % c o m p l e t e . V e r i f i c a t i o n 4 8 % c o m p l e t e . V e r i f i c a t i o n 4 9 % c o m p l e t e . V e r i f i c a t i o n 4 9 % c o m p l e t e . V e r i f i c a t i o n 5 0 % c o m p l e t e . 
V e r i f i c a t i o n 5 0 % c o m p l e t e . V e r i f i c a t i o n 5 0 % c o m p l e t e . V e r i f i c a t i o n 5 1 % c o m p l e t e . V e r i f i c a t i o n 5 1 % c o m p l e t e . V e r i f i c a t i o n 5 2 % c o m p l e t e . V e r i f i c a t i o n 5 2 % c o m p l e t e . V e r i f i c a t i o n 5 3 % c o m p l e t e . V e r i f i c a t i o n 5 3 % c o m p l e t e . V e r i f i c a t i o n 5 4 % c o m p l e t e . V e r i f i c a t i o n 5 4 % c o m p l e t e . V e r i f i c a t i o n 5 5 % c o m p l e t e . V e r i f i c a t i o n 5 5 % c o m p l e t e . V e r i f i c a t i o n 5 6 % c o m p l e t e . V e r i f i c a t i o n 5 6 % c o m p l e t e . V e r i f i c a t i o n 5 7 % c o m p l e t e . V e r i f i c a t i o n 5 7 % c o m p l e t e . V e r i f i c a t i o n 5 7 % c o m p l e t e . V e r i f i c a t i o n 5 8 % c o m p l e t e . V e r i f i c a t i o n 5 8 % c o m p l e t e . V e r i f i c a t i o n 5 9 % c o m p l e t e . V e r i f i c a t i o n 5 9 % c o m p l e t e . V e r i f i c a t i o n 6 0 % c o m p l e t e . V e r i f i c a t i o n 6 0 % c o m p l e t e . V e r i f i c a t i o n 6 1 % c o m p l e t e . V e r i f i c a t i o n 6 1 % c o m p l e t e . V e r i f i c a t i o n 6 2 % c o m p l e t e . V e r i f i c a t i o n 6 2 % c o m p l e t e . V e r i f i c a t i o n 6 3 % c o m p l e t e . V e r i f i c a t i o n 6 3 % c o m p l e t e . V e r i f i c a t i o n 6 4 % c o m p l e t e . V e r i f i c a t i o n 6 4 % c o m p l e t e . V e r i f i c a t i o n 6 4 % c o m p l e t e . V e r i f i c a t i o n 6 5 % c o m p l e t e . V e r i f i c a t i o n 6 5 % c o m p l e t e . V e r i f i c a t i o n 6 6 % c o m p l e t e . V e r i f i c a t i o n 6 6 % c o m p l e t e . V e r i f i c a t i o n 6 7 % c o m p l e t e . V e r i f i c a t i o n 6 7 % c o m p l e t e . V e r i f i c a t i o n 6 8 % c o m p l e t e . V e r i f i c a t i o n 6 8 % c o m p l e t e . V e r i f i c a t i o n 6 9 % c o m p l e t e . 
V e r i f i c a t i o n 6 9 % c o m p l e t e . V e r i f i c a t i o n 7 0 % c o m p l e t e . V e r i f i c a t i o n 7 0 % c o m p l e t e . V e r i f i c a t i o n 7 1 % c o m p l e t e . V e r i f i c a t i o n 7 1 % c o m p l e t e . V e r i f i c a t i o n 7 1 % c o m p l e t e . V e r i f i c a t i o n 7 2 %
When I run these commands from cmd.exe, the output looks like this:
MACAddress Name
34:F2:B3:A1:83:33 Broadcom 802.11n Network Adapter
I would use PowerShell normally, but that is not what I need here. I would like the output to look like this in the output file.
The output doesn't seem to care if I use setlocal enabledelayedexpansion endlocal, as the output is the same.
Thank you for your help!
EDIT
I created a blank Unicode file (I have Windows 10) and copied it to a new file that I used as the output file. Now the spaces are gone, but for whatever reason the variable created for the hostname outputs what appear to be Chinese characters:
潈瑳慮敭›䕄䭓佔ⵐ䩒䵔剄⁓਍MACAddress Name
34:F2:B3:A1:83:33 Broadcom 802.11n Network Adapter
The hostname command was used as follows:
for /f "delims= tokens=*" %%i in ('hostname') do (
set "nameHost=%%i"
)
Then, I changed tokens=* to tokens=2 and all of the characters were Chinese. Anyway, a work-around was to add the hostname to the filename instead, which resolves this issue. However, it would be ideal to learn how to have both ANSI and Unicode strings in the same file, without the spaces.
I saw a C++ article about this, and the answer was to change the code. If that is true, I don't have the code for sfc or wmic, so am I SOL there, or is there a way to have both ANSI and Unicode strings in the same file without the spaces between each character?

wmic nic where "name like '%%802%%'" get name,macaddress|more
Piping the output through more should give you an ANSI version. The underlying problem is that WMIC produces Unicode (UTF-16) output.
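The "Chinese" characters in the question's edit are consistent with ANSI bytes being decoded as UTF-16LE, where each pair of bytes becomes one 16-bit code unit. Below is a minimal C sketch of that arithmetic; the function name ansi_as_utf16le is hypothetical. 'H' (0x48) and 'o' (0x6F) pair up as 0x6F48, which falls in the CJK ideograph range, and ':' followed by a space gives 0x203A, the › visible in the garbled line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reinterpret single-byte text as UTF-16LE code units, the way a viewer
   does when it opens ANSI text as Unicode. out must hold len / 2 entries.
   Returns the number of code units written; an odd trailing byte is ignored. */
size_t ansi_as_utf16le(const char *text, size_t len, uint16_t *out)
{
    size_t n = 0;
    assert(text != NULL && out != NULL);
    for (size_t i = 0; i + 1 < len; i += 2) {
        unsigned lo = (unsigned char)text[i];      /* first byte: low half   */
        unsigned hi = (unsigned char)text[i + 1];  /* second byte: high half */
        out[n++] = (uint16_t)(lo | (hi << 8));
    }
    return n;
}
```

Because a viewer applies one decoding to the whole file, ANSI and UTF-16 text written to the same file cannot both display correctly. Converting everything to one encoding before it is written, as the pipe through more does for WMIC, avoids the problem.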

SaveToFile and then open it formatting text weird

Okay, so I am writing a program that saves a database table to a text file via the SaveToFile command. But when I open the file normally, it gives me:
TG! ¶’ò?²Ï# ª _þX g ÒcöëÏ°ã ª ? Á<Ž¶ëmÐö ª _þX
| ¾"µÈó\Îå ª Dw= ÿÿ† ÿÿ" I Á<Ž¶ëmÐö ª _þX 2 . " C l i e n t s " C l i e n t s + ð I D I D
ÿ Z ÿÿÿÿC ð S u r e n a m e S u r e n a m e ‚ ÿ ÿ j ÿÿC ð P a s s w o r d P a s s w o r d
ÿ z ÿÿ3 ð N a m e N a m e ‚ ÿ ÿ j ÿÿK ð
M o n e y P a i d
M o n e y P a i d ÿ z ÿÿK ð
M o n e y O w e d
M o n e y O w e d ÿ z ÿÿ[ ð O n c e O f f C l i e n t O n c e O f f C l i e n t ÿ ÿ Z ÿÿC ð P h o n e I D P h o n e I D
ÿ z ÿÿÿ a w e a w e Ó–I
Here is my code:
procedure TfrmRawDATA.btnStoreFeedClick(Sender: TObject);
var
StoreFeed : string;
StoreFeedFile: TextFile;
data : string;
begin
begin
if (FileExists('C:\Users\ASROCK\Desktop\IT-PAT 2014\PAT Fase 3\StoreFeedFile.txt')) then
begin
DeleteFile('C:\Users\ASROCK\Desktop\IT-PAT 2014\PAT Fase 3\StoreFeedFile.txt');
ShowMessage('Save file deleted!');
end
else
AssignFile(StoreFeedFile,'Test.txt');
FileSetAttr('C:\Users\ASROCK\Desktop\IT-PAT 2014\PAT Fase 3\StoreFeedFile.txt', faReadOnly);
dmMJCPlus.tblClients.SaveToFile('C:\Users\ASROCK\Desktop\IT-PAT 2014\PAT Fase 3\StoreFeedFile.txt');
end;
end;
I just want to know how to set the file type (or something similar) so that it doesn't give me that text.

The default saving format for ADO table/query and ClientDataSet is binary. You have the option of using XML though. You need to specify it in the call to SaveToFile:
ClientDataSet.SaveToFile('...', dfXML);
or
ADOTable.SaveToFile('...', pfXML);
Judging by the source, a file extension of '.xml' should achieve the same, although that apparently did not work for you (you seem to have tried it, according to the comments).
pfXML/dfXML are defined in 'adodb.pas' and 'dbclient.pas' respectively.

How to load a sliding diagonal vector from data stored column-wise with SSE

The sliding diagonal vector contains 16 elements, each one an 8-bit unsigned integer.
Without SSE and a bit simplified it would have looked like this in C:
int width=1000000; // a big number
uint8_t matrix[width][16];
fill_matrix_with_interesting_values(&matrix);
for (int i=0; i < width - 16; ++i) {
    uint8_t diagonal_vector[16];
    for (int j=0; j<16; ++j) {
        diagonal_vector[j] = matrix[i+j][j];
    }
    do_something(&diagonal_vector);
}
but in my case I can only load column-wise (vertically) from the matrix with the _mm_load_si128 intrinsic. The sliding diagonal vector moves horizontally, so I need to load 16 column vectors in advance and use one element from each of them to create the diagonal vector.
Is it possible to make a fast low-memory implementation for this with SSE?
Update Nov 14 2016: Providing some more details. In my case I read single-letter codes from a text file in FASTA format. Each letter represents a certain amino acid. Each amino acid has a specific column vector associated with it. That column vector is looked up from a constant table (a BLOSUM matrix). In C code it would look like this
uint8_t c;
while ((c = read_next_letter_from_file()) != 0) {
    column_vector = lookup_from_const_table(c);
    uint8_t diagonal_vector[16];
    /* ... rearrange the values from the latest column
       vectors into the diagonal_vector ... */
    do_something(&diagonal_vector);
}

The implementation I will present only needs one column load per iteration. First we initialize some variables
const __m128i mask1=_mm_set_epi8(0,0,0,0,0,0,0,0,255,255,255,255,255,255,255,255);
const __m128i mask2=_mm_set_epi8(0,0,0,0,255,255,255,255,0,0,0,0,255,255,255,255);
const __m128i mask3=_mm_set_epi8(0,0,255,255,0,0,255,255,0,0,255,255,0,0,255,255);
const __m128i mask4=_mm_set_epi8(0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255);
__m128i v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14, v15;
Then for each step the variable v_column_load is loaded with the next column.
v15 = v_column_load;
v7 = _mm_blendv_epi8(v7,v15,mask1);
v3 = _mm_blendv_epi8(v3,v7,mask2);
v1 = _mm_blendv_epi8(v1,v3,mask3);
v0 = _mm_blendv_epi8(v0,v1,mask4);
v_diagonal = v0;
In the next step the variable name numbers in v0, v1, v3, v7, v15 are incremented by 1 and adjusted to be in the range 0 to 15. In other words: newnumber = ( oldnumber + 1 ) modulo 16.
v0 = v_column_load;
v8 = _mm_blendv_epi8(v8,v0,mask1);
v4 = _mm_blendv_epi8(v4,v8,mask2);
v2 = _mm_blendv_epi8(v2,v4,mask3);
v1 = _mm_blendv_epi8(v1,v2,mask4);
v_diagonal = v1;
After 16 iterations the v_diagonal will start to contain the correct diagonal values.
Looking at mask1,mask2, mask3, mask4, we see a pattern that can be used to generalize this algorithm for other vector lengths (2^n).
For instance, for vector length 8, we would only need 3 masks and the iteration steps would look like this:
Step 1 (column a loaded):
v7 = a a a a a a a a
v6 =
v5 =
v4 =
v3 =         a a a a
v2 =
v1 =             a a
v0 =               a
Step 2 (column b loaded):
v0 = b b b b b b b b
v7 = a a a a a a a a
v6 =
v5 =
v4 =         b b b b
v3 =         a a a a
v2 =             b b
v1 =             a b
Step 3 (column c loaded):
v1 = c c c c c c c c
v0 = b b b b b b b b
v7 = a a a a a a a a
v6 =
v5 =         c c c c
v4 =         b b b b
v3 =         a a c c
v2 =           a b c
Step 4 (column d loaded):
v2 = d d d d d d d d
v1 = c c c c c c c c
v0 = b b b b b b b b
v7 = a a a a a a a a
v6 =         d d d d
v5 =         c c c c
v4 =         b b d d
v3 =         a b c d
Step 5 (column e loaded):
v3 = e e e e e e e e
v2 = d d d d d d d d
v1 = c c c c c c c c
v0 = b b b b b b b b
v7 = a a a a e e e e
v6 =         d d d d
v5 =     a a c c e e
v4 =       a b c d e
Step 6 (column f loaded):
v4 = f f f f f f f f
v3 = e e e e e e e e
v2 = d d d d d d d d
v1 = c c c c c c c c
v0 = b b b b f f f f
v7 = a a a a e e e e
v6 =     b b d d f f
v5 =     a b c d e f
Step 7 (column g loaded):
v5 = g g g g g g g g
v4 = f f f f f f f f
v3 = e e e e e e e e
v2 = d d d d d d d d
v1 = c c c c g g g g
v0 = b b b b f f f f
v7 = a a c c e e g g
v6 =   a b c d e f g
Step 8 (column h loaded):
v6 = h h h h h h h h
v5 = g g g g g g g g
v4 = f f f f f f f f
v3 = e e e e e e e e
v2 = d d d d h h h h
v1 = c c c c g g g g
v0 = b b d d f f h h
v7 = a b c d e f g h <-- this vector now contains the diagonal
Step 9 (column i loaded):
v7 = i i i i i i i i
v6 = h h h h h h h h
v5 = g g g g g g g g
v4 = f f f f f f f f
v3 = e e e e i i i i
v2 = d d d d h h h h
v1 = c c e e g g i i
v0 = b c d e f g h i <-- this vector now contains the diagonal
Step 10 (column j loaded):
v0 = j j j j j j j j
v7 = i i i i i i i i
v6 = h h h h h h h h
v5 = g g g g g g g g
v4 = f f f f j j j j
v3 = e e e e i i i i
v2 = d d f f h h j j
v1 = c d e f g h i j <-- this vector now contains the diagonal
Sidenote: I discovered this way of loading a diagonal vector when I was working on an implementation of the Smith-Waterman algorithm. Some more information can be found on the old SourceForge project web page.
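To check the 16-element scheme without SSE hardware, here is a scalar model in plain C. It is a sketch under stated assumptions, not the answer's actual implementation: the names diag_state, blend and diag_push are hypothetical, each register is modelled as a 16-byte array, and blend imitates _mm_blendv_epi8 (an SSE4.1 intrinsic) for mask1 to mask4 by taking byte j from the source when the matching bit of j is clear. This follows the _mm_set_epi8 convention that the last argument is element 0. Under this model, element j of v_diagonal at step t holds element j of the column loaded at step t - j, which matches the diagram if it is read highest element first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N 16

/* Ring of 16 "registers" (v0..v15), each modelled as 16 bytes. */
typedef struct {
    uint8_t r[N][N];
    unsigned step;
} diag_state;

/* Scalar stand-in for _mm_blendv_epi8(dst, src, mask) with the masks from
   the answer: level 8 = mask1, 4 = mask2, 2 = mask3, 1 = mask4.
   Byte j is taken from src when (j & level) == 0, otherwise dst is kept. */
void blend(uint8_t *dst, const uint8_t *src, unsigned level)
{
    for (unsigned j = 0; j < N; ++j)
        if ((j & level) == 0)
            dst[j] = src[j];
}

/* Feed one column and return the diagonal register for this step.
   The register numbers rotate by one per step, modulo 16. */
const uint8_t *diag_push(diag_state *s, const uint8_t *column)
{
    assert(s != NULL && column != NULL);
    unsigned t = s->step++ % N;
    memcpy(s->r[(t + 15) % N], column, N);            /* v15 = v_column_load */
    blend(s->r[(t + 7) % N], s->r[(t + 15) % N], 8);  /* v7 <- v15, mask1    */
    blend(s->r[(t + 3) % N], s->r[(t + 7) % N], 4);   /* v3 <- v7,  mask2    */
    blend(s->r[(t + 1) % N], s->r[(t + 3) % N], 2);   /* v1 <- v3,  mask3    */
    blend(s->r[t], s->r[(t + 1) % N], 1);             /* v0 <- v1,  mask4    */
    return s->r[t];                                   /* v_diagonal = v0     */
}
```

A zero-initialized diag_state is a valid starting point. The first 15 results still contain bytes from the initial state, so only the results from the 16th call onward are complete diagonals.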
