Combine strings in Python pandas

I ran into a problem combining strings while analyzing a dataset.
The data frame looks like this:
IP Event
01 check
01 redo
01 view
02 check
02 check
03 review
04 delete
As you can see, the IP column contains duplicates. How can I combine the Event values for each IP, in order? For example, the result I'm looking for is:
IP result
01 check->redo->view
02 check->check
03 review
04 delete

try this:
In [27]: df.groupby('IP').agg('->'.join).reset_index()
Out[27]:
IP Event
0 01 check->redo->view
1 02 check->check
2 03 review
3 04 delete
or
In [26]: df.groupby('IP').agg('->'.join)
Out[26]:
Event
IP
01 check->redo->view
02 check->check
03 review
04 delete

Try this with lambda:
df.groupby("IP")['Event'].apply(lambda x: '->'.join(x)).reset_index()
# IP Event
# 0 1 check->redo->view
# 1 2 check->check
# 2 3 review
# 3 4 delete
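
For reference, here is a minimal self-contained sketch of the same idea, assuming the IP column holds strings so the leading zeros survive. The helper name join_events and its sep parameter are made up for illustration; the "result" column name is taken from the question. Passing sort=False keeps the groups in order of first appearance rather than sorting by key, which may or may not be what you want.

```python
import pandas as pd

# Sample data from the question; IP is kept as strings to preserve "01", "02", ...
df = pd.DataFrame({
    "IP": ["01", "01", "01", "02", "02", "03", "04"],
    "Event": ["check", "redo", "view", "check", "check", "review", "delete"],
})


def join_events(frame, sep="->"):
    """Join the Event strings of each IP group, keeping row order within a group."""
    return (
        frame.groupby("IP", sort=False)["Event"]
        .agg(sep.join)
        .reset_index(name="result")
    )


result = join_events(df)
print(result)
```

If the IP values are read from a file, they may come back as integers (1, 2, ...) unless the column is read as a string type, which would explain the difference between the two outputs above.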

Related

How to create an array with leading zeros in Bash?

Not very difficult:
#!/bin/bash
hr=(00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23)
for i in ${hr[@]}; do
echo ${hr[i]}
done
But:
user@userver:$ ./stat.sh
00
01
02
03
04
05
06
07
./stat.sh: string 7: 08: value too great for base (error token is "08")
Bash thinks that leading zero means octal number system. What to do?

I think your fundamental problem is that you're using each element of the array to index the array itself.
If you want to just print out the array elements, you should be outputting ${i} rather than ${hr[i]}. The latter is useless in your current code since element zero is 00, element one is 01, and so on.
On the chance that this is a simplified example and you do want to reference a different array based on the content of this one, you have a couple of options:
Realise that the value of an integer and the presentation of it are two distinct things. In other words, use 0, 1, 2, ... but something like printf "%02d" $i if you need to output it (noting this is only outputting the value in the first array, not the one you're looking up things in).
Exclusively use strings and a string-based associative array rather than an integer-based one, see typeset -A for detail.

Use brace expansion (ranges, repetition).
To create an array:
hours=({00..23})
Loop through it:
for i in "${hours[@]}"; do
echo "$i"
done
Or loop through a brace expansion:
for i in {00..23}; do
echo "$i"
done
Regarding your error, it's because in bash arithmetic, all numbers with leading zeroes are treated as octals, and 08 and 09 are invalid octal numbers. All indexed array subscripts are evaluated as arithmetic expressions. You can fix the problem by using the notation base#number to specify a number system. So for base 10: 10#09, or for i=09, 10#$i. The variable must be prefixed with $, 10#i does not work.
You should be printing your array like this anyway:
Loop through elements:
for i in "${hr[@]}"; do
echo "$i"
done
Loop through indexes:
for i in "${!hr[@]}"; do
echo "index is $i"
echo "element is ${hr[i]}"
done
If you need to do arithmetic on the hours, or any zero padded number, you will lose the zero padding. You can print it again with printf: printf %.2d "$num", where 2 is the minimum width.
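
The same separation between a number's value and its zero-padded presentation can be sketched outside Bash. The Python snippet below is only an illustration for comparison, not part of the Bash fix: the function name next_hour is hypothetical, int(hour, 10) plays the role of 10#$i, and the 02d format plays the role of printf %.2d.

```python
def next_hour(hour):
    """Return the hour after a zero-padded hour string such as "08"."""
    value = int(hour, 10)                # force base 10, like 10#$i in Bash
    return f"{(value + 1) % 24:02d}"     # pad back to width 2, like printf %.2d


for h in ("07", "08", "09", "23"):
    print(h, "->", next_hour(h))
```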

When you iterate with:
for i in ${hr[@]}; do
it iterates over the values of the array, which are:
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23
But when within the loop it has:
echo ${hr[i]}
it is using i as the index of the hr array.
In Bash, an index within the brackets of an array like [i] is an arithmetic context. So when i=08, the leading 0 causes the value to be treated as an octal number within that arithmetic context, and 8 is not a valid octal digit.
If you wanted to iterate your array indexes to process its values by index, then you'd start the loop as:
for i in "${!hr[@]}"; do
This version works because it iterates over the array indexes, storing each one in the variable i:
#!/usr/bin/env bash
declare -a hr=(00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23)
for i in "${!hr[@]}"; do
printf '%s\n' "${hr[i]}"
done
Now if all you want is iterate the values of the hr array, just do this way:
#!/usr/bin/env bash
declare -a hr=(00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23)
for e in "${hr[@]}"; do
printf '%s\n' "$e"
done
No need to index the array within the loop, since the elements are already expanded into e.

You can specify that the number is base 10:
echo ${hr[10#$i]}

Cobol - Date loop

How can I implement a loop that will display subsequent months (01/01/2018, 01/02/2018 etc.) x-times? Additionally, how can I set day,month,year as a variable? By the way, I'm new to Cobol.
This is the code I have written so far:
01 YYYYMMDD Pic 9(8).
01 Integer-Form Pic S9(9).
Move Function Current-Date(1:8) to YYYYMMDD
Compute Integer-Form = Function Integer-of-Date(YYYYMMDD)
Add 12 to Integer-Form
Compute YYYYMMDD = Function Date-of-Integer(Integer-Form)
Display 'Date: ' YYYYMMDD.
EDIT!
PERFORM VARYING Number-Periods FROM 0 BY 1 UNTIL Number-Periods > 36
DISPLAY ws-current-day, "/", ws-current-month, "/", ws-current-year
ADD 1 TO WS-current-month
IF ws-current-month > 12 THEN
COMPUTE ws-current-month = 1
ADD 1 TO WS-current-year
END-IF
END-PERFORM

Best to use redefines and 88 to check month:
01 YYYYMMDD PIC 9(8).
01 YYYYMMDD-R REDEFINES YYYYMMDD.
05 YYYY PIC 9(4).
05 MM PIC 9(2).
88 VALID-MONTH VALUE 1 2 3 4 5 6
7 8 9 10 11 12.
05 DD PIC 9(2).
Then you can use the 88 in the check (within the PERFORM VARYING as you discovered):
IF NOT VALID-MONTH
... increment year

LIBMODBUS: Writing to a double register?

Is there a way I can write one value to a double register using LIBMODBUS? For example writing value 100,000 to be spread across one register. Currently using modbus_write_registers to write 10,000 I am sending the modbus message
rc = modbus_write_registers(ctx, 4, 2, tab_reg); (Where tab_reg[0] = 10,000 and tab_reg[1] = 0)
0A 10 00 04 00 02 04 27 10 00 00 DC 09
Ideally, the message I believe I would like to send would not include the 00 00 for the zero value. Is this possible with libmodbus?
NB - I have also attempted using modbus_write_register() and this produced a much longer message so I am inclined to believe write registerS is the way to go.

Bash parse dynamic arrays and store specific values in an other

I am sending a dbus-send command which returns something like :
method return sender=:1.833 -> dest=:1.840 reply_serial=2
array of bytes [
00 01 02 03 04 05
]
int 1
boolean true
The "array of bytes" size is dynamic and can contain n values.
I store the result of the dbus-send command in an array by using :
array=($(dbus-send --session --print-repl ..readValue))
I want to be able to retrieve the values contained into the array of bytes and be able to display one or all of them if necessary like this :
data read => 00 01 02 03 04 05
or
first data read => 00
First data is always reachable by ${array[10]} and I think it is possible to use a structure like:
IFS=" " read -a array
for element in "${array[@]:10}"
do
...
done
Any thoughts on how to accomplish this?

You really should use some library for dbus, like Net::DBus or something similar.
Anyway, for the above example you could write:
#fake dbus-send command
dbus-send() {
cat <<EOF
method return sender=:1.833 -> dest=:1.840 reply_serial=2
array of bytes [
00 01 02 03 04 05
]
int 1
boolean true
EOF
}
array=($(dbus-send --session --print-repl ..readValue))
data=($(echo "${array[@]}" | grep -oP 'array\s*of\s*bytes\s*\[\s*\K[^]]*(?=\])'))
echo "ALL data ==${data[@]}=="
echo "First item: ${data[0]}"
echo "All items as lines"
printf "%s\n" "${data[@]}"
data=($(echo "${array[@]}" | sed 's/.*array of bytes \[\([^]]*\)\].*/\1/'))
echo "ALL data ==${data[@]}=="
echo "First item: ${data[0]}"
echo "All items as lines"
printf "%s\n" "${data[@]}"
Both examples print:
ALL data ==00 01 02 03 04 05==
First item: 00
All items as lines
00
01
02
03
04
05
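
If a scripting language is an option, the same extraction can be sketched in Python with a regular expression. This is only an illustration under assumptions: the reply text is hard-coded in SAMPLE_REPLY (copied from the question) instead of being captured from a real dbus-send call, and the function name extract_bytes is hypothetical.

```python
import re

# Reply text copied from the question; a real script would capture it
# from the dbus-send command instead of hard-coding it.
SAMPLE_REPLY = """\
method return sender=:1.833 -> dest=:1.840 reply_serial=2
   array of bytes [
      00 01 02 03 04 05
   ]
   int 1
   boolean true
"""


def extract_bytes(reply):
    """Return the tokens between 'array of bytes [' and the closing ']'."""
    match = re.search(r"array of bytes \[([^\]]*)\]", reply)
    if match is None:
        return []
    # split() with no argument drops the surrounding newlines and spaces
    return match.group(1).split()


data = extract_bytes(SAMPLE_REPLY)
print("data read =>", " ".join(data))
print("first data read =>", data[0])
```

Because the match is anchored on the brackets rather than on a fixed position such as element 10, it keeps working when the number of bytes changes.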

How can I process this array without huge amounts of code?

I'll start by admitting this is for my homework and I don't expect anything specific, just a tip perhaps. The input file is just one 30-byte field that contains names. The output file is two fields of 30 bytes each. I'll list the code to show what I mean by this. The program needs to read the input file, putting the names into an array, and then print them to the two fields in the output file. It would be simple enough if the output file was like this:
name 1 name 2
name 3 name 4
etc...
but it's supposed to be:
name 1 name 55
name 2 name 56
....
name 54 name 108
I'm not quite understanding how I can code the program to do this without having 54 lines of code (1 for each line in the output). Well here's the code I have so far:
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT NAMELIST-FILE-IN
ASSIGN TO 'NAMELIST.SEQ'
ORGANIZATION IS LINE SEQUENTIAL.
SELECT NAMELIST-FILE-OUT
ASSIGN TO 'NAMELIST.RPT'
ORGANIZATION IS LINE SEQUENTIAL.
DATA DIVISION.
FILE SECTION.
FD NAMELIST-FILE-IN.
01 NAME-IN PIC X(30).
FD NAMELIST-FILE-OUT.
01 NAME-OUT PIC X(60).
WORKING-STORAGE SECTION.
01 ARE-THERE-MORE-RECORDS PIC X(3) VALUE 'YES'.
01 PAGE-CTR PIC 99 VALUE ZERO.
01 SUB PIC 999 VALUE 1.
01 LEFT-NAME PIC 99 VALUE 54.
01 RIGHT-NAME PIC 9(3) VALUE 108.
01 WS-DATE.
05 RUN-YEAR PIC XX.
05 RUN-MONTH PIC XX.
05 RUN-DAY PIC XX.
01 HEADING-LINE.
05 PIC X(15) VALUE SPACES.
05 PIC X(20)
VALUE 'NAME LIST REPORT'.
05 HL-DATE.
10 DAY-HL PIC XX.
10 PIC X VALUE '/'.
10 MONTH-HL PIC XX.
10 PIC X VALUE '/'.
10 YEAR-HL PIC XX.
05 PIC X(3) VALUE SPACES.
05 PIC X(5) VALUE 'PAGE'.
05 PAGE-NUMBER-HL PIC Z9 VALUE 1.
01 DETAIL-LINE.
05 NAME-LEFT PIC X(30).
05 NAME-RIGHT PIC X(30).
01 NAME-ARRAY.
05 NAME-X OCCURS 108 PIC X(30).
PROCEDURE DIVISION.
100-MAIN.
OPEN INPUT NAMELIST-FILE-IN
OPEN OUTPUT NAMELIST-FILE-OUT
ACCEPT WS-DATE FROM DATE.
MOVE RUN-MONTH TO MONTH-HL
MOVE RUN-DAY TO DAY-HL
MOVE RUN-YEAR TO YEAR-HL
PERFORM 200-ACCEPT-INPUT
CLOSE NAMELIST-FILE-IN
CLOSE NAMELIST-FILE-OUT
STOP RUN.
200-ACCEPT-INPUT.
PERFORM UNTIL SUB > 108
READ NAMELIST-FILE-IN
MOVE NAME-IN TO NAME-X (SUB)
ADD 1 TO SUB
END-PERFORM
PERFORM 300-PRINT-ONE-PAGE.
300-PRINT-ONE-PAGE.
WRITE NAME-OUT FROM HEADING-LINE
AFTER ADVANCING PAGE
ADD 1 TO PAGE-CTR
Here's the final code for this program if anyone is interested in seeing it.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT NAMELIST-FILE-IN
ASSIGN TO 'NAMELIST.SEQ'
ORGANIZATION IS LINE SEQUENTIAL.
SELECT NAMELIST-FILE-OUT
ASSIGN TO 'NAMELIST.RPT'
ORGANIZATION IS LINE SEQUENTIAL.
DATA DIVISION.
FILE SECTION.
FD NAMELIST-FILE-IN.
01 NAME-IN PIC X(30).
FD NAMELIST-FILE-OUT.
01 NAME-OUT PIC X(80).
WORKING-STORAGE SECTION.
01 ARE-THERE-MORE-RECORDS PIC X(3) VALUE 'YES'.
01 PAGE-CTR PIC 99 VALUE ZERO.
01 SUB PIC 999.
01 SUB2 PIC 999.
01 LEFT-NAME PIC 99 VALUE 54.
01 RIGHT-NAME PIC 9(3) VALUE 108.
01 WS-DATE.
05 RUN-YEAR PIC XX.
05 RUN-MONTH PIC XX.
05 RUN-DAY PIC XX.
01 HEADING-LINE.
05 PIC X(26) VALUE SPACES.
05 PIC X(35)
VALUE 'NAME LIST REPORT'.
05 HL-DATE.
10 DAY-HL PIC XX.
10 PIC X VALUE '/'.
10 MONTH-HL PIC XX.
10 PIC X VALUE '/'.
10 YEAR-HL PIC XX.
05 PIC X(3) VALUE SPACES.
05 PIC X(5) VALUE 'PAGE'.
05 PAGE-NUMBER-HL PIC Z9.
01 DETAIL-LINE.
05 NAME-LEFT PIC X(40).
05 NAME-RIGHT PIC X(40).
01 NAME-ARRAY.
05 NAME-X OCCURS 108 PIC X(30).
PROCEDURE DIVISION.
100-MAIN.
OPEN INPUT NAMELIST-FILE-IN
OPEN OUTPUT NAMELIST-FILE-OUT
ACCEPT WS-DATE FROM DATE.
MOVE RUN-MONTH TO MONTH-HL
MOVE RUN-DAY TO DAY-HL
MOVE RUN-YEAR TO YEAR-HL
PERFORM UNTIL ARE-THERE-MORE-RECORDS = 'NO'
PERFORM 200-ACCEPT-INPUT
END-PERFORM
CLOSE NAMELIST-FILE-IN
CLOSE NAMELIST-FILE-OUT
STOP RUN.
200-ACCEPT-INPUT.
INITIALIZE NAME-ARRAY
MOVE 1 TO SUB
PERFORM UNTIL SUB > 108 OR ARE-THERE-MORE-RECORDS = 'NO'
READ NAMELIST-FILE-IN
AT END
MOVE 'NO' TO ARE-THERE-MORE-RECORDS
MOVE SPACES TO NAME-IN
END-READ
MOVE NAME-IN TO NAME-X (SUB)
ADD 1 TO SUB
END-PERFORM
PERFORM 300-PRINT-ONE-PAGE.
300-PRINT-ONE-PAGE.
ADD 1 TO PAGE-CTR
MOVE PAGE-CTR TO PAGE-NUMBER-HL
WRITE NAME-OUT FROM HEADING-LINE
AFTER ADVANCING PAGE
MOVE SPACES TO DETAIL-LINE
WRITE NAME-OUT FROM DETAIL-LINE
AFTER ADVANCING 1
PERFORM VARYING SUB FROM 1 BY 1 UNTIL SUB > 54
MOVE NAME-X (SUB) TO NAME-LEFT
COMPUTE SUB2 = SUB + 54
MOVE NAME-X (SUB2) TO NAME-RIGHT
WRITE NAME-OUT FROM DETAIL-LINE
AFTER ADVANCING 1
END-PERFORM.

I must apologize, I cannot think of a way to guide you without giving away the answer. I guess this is a spoiler alert.
One possible method you could use would be to add a variable SUB2 to Working-Storage and...
Perform Varying SUB From 1 By 1 Until SUB > 54
Move NAME-X(SUB) to NAME-LEFT
Compute SUB2 = SUB + 54
MOVE NAME-X(SUB2) to NAME-RIGHT
Write NAME-OUT from DETAIL-LINE After Advancing 1 Line
End-Perform
This is kind of kludgy and ties you to an array of 108 elements. You could use a record counter that you increment by 1 for each record read and then compute the values I show hardcoded in the sample above (54 is simply half of 108).
I don't show the page break logic, so perhaps I didn't give the whole show away.
I hope this helps you.
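
The idea of computing the split point from a record count, instead of hardcoding 54, can be sketched as follows. This is a Python illustration of the index arithmetic only, not COBOL; the function name two_column_rows is hypothetical, and it handles a single page with no heading or page-break logic.

```python
def two_column_rows(names):
    """Pair name i with name i + half, so the left column is filled before the right."""
    half = (len(names) + 1) // 2          # rows needed; 54 when there are 108 names
    rows = []
    for i in range(half):
        # The right column is blank when the name count is odd
        right = names[i + half] if i + half < len(names) else ""
        rows.append((names[i], right))
    return rows


names = [f"name {n}" for n in range(1, 109)]
rows = two_column_rows(names)
print(rows[0])     # first output line
print(rows[-1])    # last output line
```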

I would have 2 arrays.
One containing the whole file.
01 DETAIL-LINE-ARRAY.
02 DETAIL-LINE OCCURS 54.
05 NAME-LEFT PIC X(30).
05 NAME-RIGHT PIC X(30).
The other is the one you already have, NAME-ARRAY.
I would populate DETAIL-LINE-ARRAY first.
I would then read DETAIL-LINE-ARRAY twice to fill NAME-ARRAY.
Then you can read NAME-ARRAY sequentially.
==========================================================================
Another solution:
You can read the file twice. On the first pass, handle only the left names and populate NAME-ARRAY.
On the second pass, handle only the right names and continue populating NAME-ARRAY.
After both passes, you can read your array NAME-ARRAY.
==========================================================================
There is a problem with both of the previous solutions: you have to know how many lines your file contains. Yep, you can have dynamic programming in COBOL too :-)
You have to use SORT.
In FILE SECTION add
SD SORT-WORK
01 SORT-RECORD.
05 SR-ORDER PIC 9(09).
05 SR-NAME PIC X(30).
In your PROCEDURE DIVISION add the following (in pseudo-code; you'll need to add some variables to your WORKING-STORAGE SECTION):
SORT SORT-WORK
ASCENDING SR-ORDER
INPUT PROCEDURE IS 1000-INPUT
OUTPUT PROCEDURE IS 2000-OUTPUT
1000-INPUT SECTION.
MOVE 0 TO I.
PERFORM until end-of-file of NAMELIST-FILE-IN
ADD 1 TO I
READ left-name
MOVE I TO SR-ORDER
MOVE left-name TO SR-NAME
* This operation populates the sort file...
RELEASE SORT-RECORD
END-PERFORM.
PERFORM until end-of-file of NAMELIST-FILE-IN
ADD 1 TO I
READ right-name
MOVE I TO SR-ORDER
MOVE right-name TO SR-NAME
* This operation populates the sort file...
RELEASE SORT-RECORD
END-PERFORM.
MOVE I TO WS-NB-NAMES.
2000-OUTPUT SECTION.
PERFORM VARYING I FROM 1 BY 1 UNTIL I > WS-NB-NAMES
* This operation returns the "next" record. It begins with the first, then the second...
RETURN SORT-WORK
MOVE SR-NAME TO NAME-LEFT
RETURN SORT-WORK
MOVE SR-NAME TO NAME-RIGHT
WRITE NAME-OUT FROM DETAIL-LINE
END-PERFORM.
You have some example here for SORT
Have fun !
