My table data has 5 columns and 5288 rows. I am trying to write that data to a CSV file, adding column names. The code for that looks like this:
cursor = conn.cursor()
cursor.execute('Select * FROM classic.dbo.sample3')
rows = cursor.fetchall()
print ("The data has been fetched")
dataframe = pd.DataFrame(rows, columns =['s_name', 't_tid','b_id', 'name', 'summary'])
dataframe.to_csv('data.csv', index = None)
The data looks like this:
s_sname  t_tid  b_id  name    summary
-------  -----  ----  ------  -------------------------------------------
db1      001    100   careie  hello this is john speaking blah blah blah
It looks like the above but has 5288 such rows.
When I execute the code above, it throws this error:
ValueError: Shape of passed values is (5288, 1), indices imply (5288, 5)
I do not understand what I am doing wrong.
Let pandas run the query directly with pd.read_sql; it executes the statement and builds the DataFrame, including the column names, for you:
dataframe = pd.read_sql('Select * FROM classic.dbo.sample3',con=conn)
dataframe.to_csv('data.csv', index = None)
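If you prefer to keep the explicit cursor approach, a common workaround for that ValueError is to turn each fetched row into a plain list before building the DataFrame, so pandas sees five separate values per row instead of one row object. A minimal sketch, assuming the same cursor and column names as above:
rows = cursor.fetchall()
# Convert each driver row object into a plain list of its five values.
dataframe = pd.DataFrame([list(r) for r in rows],
                         columns=['s_name', 't_tid', 'b_id', 'name', 'summary'])
dataframe.to_csv('data.csv', index=None)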
I've been trying to learn how to use sqlite3 with Python 3.10, and I can't find any explanation of how I'm supposed to grab saved data from a database and store it in a variable.
I'm attempting to do that myself in this code, but it just prints out:
<sqlite3.Cursor object at 0x0000018E3C017AC0>
Anyone know the solution to this?
My code is below
import sqlite3
con = sqlite3.connect('main.db')
cur = con.cursor()
#Create a table called "Datatable" if it does not exist
cur.execute('''CREATE TABLE IF NOT EXISTS datatable
               (Name PRIMARY KEY, age, pronouns) ''')
# The "PRIMARY KEY" after Name disallows information from being inserted
# into the table twice in a row.
name = 'TestName'#input("What is your name? : ")
age = 'TestAge'#input("What is your age? : ")
def data_entry():
    cur.execute("INSERT INTO datatable (name, age)")
    con.commit
name = cur.execute('select name from datatable')
print(name)
Expected result from print(name): TestName
Actual result: <sqlite3.Cursor object at 0x00000256A58B7AC0>
The execute statement fills the cur object with data, but then you need to get the data out:
rows = cur.fetchall()
for row in rows:
    print(row)
You can read more here: https://www.sqlitetutorial.net/sqlite-python/sqlite-python-select/
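Applied to the original code, that means fetching from the cursor and reading the value out of the returned row, for example (a minimal sketch, assuming a TestName row has already been inserted into the table):
cur.execute('SELECT name FROM datatable WHERE name = ?', ('TestName',))
row = cur.fetchone()   # a tuple like ('TestName',), or None if nothing matched
if row is not None:
    name = row[0]      # the actual value, now usable as a normal variable
    print(name)        # -> TestName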
I am attempting to use MSSQL's FOR JSON AUTO to go easily from a query to a Go struct. The data returned looks like JSON, but I am having trouble converting it from a string into the known struct I want.
func main() {
    type LOBData struct {
        COB_ID    int     `json:"COB_ID"`
        GrossLoss float64 `json:"GrossLoss"`
    }
    type ResultData struct {
        YearID    int       `json:"YearID"`
        EventID   int       `json:"EventID"`
        Modelcode int       `json:"modelcode"`
        Industry  float64   `json:"Industry"`
        LOB       []LOBData `json:"y"`
    }
    db, err := sql.Open("sqlserver", ConnString())
    checkErr(err)
    defer db.Close()
    var result string
    err = db.QueryRow(`
        SELECT i.YearID, i.EventID, i.modelcode, totalloss as Industry, y.COB_ID, y.GrossLoss
        FROM dbo.CS_IndustryLossv8_7938 AS i INNER JOIN
        dbo.Tb_YLT AS y ON i.YearID = y.YearID AND i.EventID = y.EventID AND i.modelcode = y.Modelcode
        where YLT_DVID=25
        FOR JSON AUTO`).Scan(&result)
    fmt.Println(result)
    YLT := ResultData{}
    //var YLT []ResultData
    err = json.Unmarshal([]byte(result), &YLT)
    checkErr(err)
    fmt.Println(YLT)
}
fmt.Println(result) prints:
[{"YearID":7687,"EventID":101900,"modelcode":41,"Industry":1.176648913256758e+010,"y":[{"COB_ID":5,"GrossLoss":6.729697615695682e+003}]},.....
but fmt.Println(YLT) returns:
{0 0 0 0 []}
I am getting an error of "unexpected end of JSON input".
While Go does not have a string length limit, MSSQL does (8,000 characters). If I limit my query to the top 3 rows and use var YLT []ResultData, it works. Is there any way of doing this with MSSQL and Go, or should I be using different server tech?
I'm not sure why FOR JSON specifically does that, but it's not normally an issue with selecting nvarchar(max) columns.
Another way to get out of the issue is to assign it to a variable first:
DECLARE @j nvarchar(max) =
(
    SELECT ...
    FROM...
    FOR JSON AUTO
);
SELECT @j;
Apologies... I found the answer: unlike the grid output, MSSQL splits the result up into multiple rows that then need to be concatenated.
https://learn.microsoft.com/en-us/sql/relational-databases/json/format-query-results-as-json-with-for-json-sql-server?view=sql-server-ver15
Output of the FOR JSON clause
The output of the FOR JSON clause has the following characteristics:
The result set contains a single column.
A small result set may contain a single row.
A large result set splits the long JSON string across multiple rows.
By default, SQL Server Management Studio (SSMS) concatenates the results into a single row when the output setting is Results to Grid. The SSMS status bar displays the actual row count.
Other client applications may require code to recombine lengthy results into a single, valid JSON string by concatenating the contents of multiple rows. For an example of this code in a C# application, see Use FOR JSON output in a C# client app.
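In Go terms, that means reading every row the FOR JSON query returns and stitching the fragments back together before unmarshalling. A rough sketch under those assumptions (readForJSON and forJSONQuery are illustrative names; db, ResultData and checkErr are the ones from the question; the imports from the question plus "strings" are assumed):
// Sketch: FOR JSON AUTO may split long output across several rows,
// so read them all and concatenate before unmarshalling.
func readForJSON(db *sql.DB, query string) (string, error) {
    rows, err := db.Query(query)
    if err != nil {
        return "", err
    }
    defer rows.Close()

    var sb strings.Builder
    for rows.Next() {
        var chunk string
        if err := rows.Scan(&chunk); err != nil {
            return "", err
        }
        sb.WriteString(chunk) // each row holds one fragment of the JSON document
    }
    return sb.String(), rows.Err()
}

// Usage:
//   jsonText, err := readForJSON(db, forJSONQuery)
//   checkErr(err)
//   var YLT []ResultData // the document is a JSON array, so unmarshal into a slice
//   checkErr(json.Unmarshal([]byte(jsonText), &YLT))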
Currently my code has simple tables containing the data needed for each object, like this:
infantry = {class = "army", type = "human", power = 2}
cavalry = {class = "panzer", type = "motorized", power = 12}
battleship = {class = "navy", type = "motorized", power = 256}
I use the table names as identifiers in various functions, so their values can be processed one by one; the functions are simply called and have access to the values.
Now I want to have this data stored in a spreadsheet (csv file) instead that looks something like this:
Name        class   type       power
Infantry    army    human      2
Cavalry     panzer  motorized  12
Battleship  navy    motorized  256
The spreadsheet will not have more than 50 lines, and I want to be able to add more columns in the future.
I tried a couple of approaches from similar situations I found here, but due to my lacking skills I failed to access any values from the nested table. I think this is because I don't fully understand how the table is structured after each line of the csv file has been read into it, and that is why I fail to print any values at all.
If there is a way to get the name, class, type and power from the table and use each line just as I used my old simple tables, I would appreciate an educational example. Another approach could be to declare new tables from the csv, line by line, that behave exactly like my old simple tables. I don't know if this is doable.
Using Lua 5.1
You can read the csv file in as a string; I will use a multi-line string here to represent the csv.
gmatch with the pattern [^\n]+ will return each row of the csv.
gmatch with the pattern [^,]+ will return the value of each column from a given row.
If more rows or columns are added, or if the columns are moved around, the information will still be converted reliably, as long as the first row contains the header information.
The only column that cannot move is the first one, the Name column; if it is moved, it will change the key used to store the row in the table.
Using gmatch and the two patterns [^,]+ and [^\n]+, you can separate the string into the rows and columns of the csv. Comments in the following code:
local csv = [[
Name,class,type,power
Infantry,army,human,2
Cavalry,panzer,motorized,12
Battleship,navy,motorized,256
]]
local items = {} -- store each row here, keyed by its Name
local headers = {} -- column names taken from the first line
local first = true
for line in csv:gmatch("[^\n]+") do
    if first then -- handle the first line and capture our headers
        local count = 1
        for header in line:gmatch("[^,]+") do
            headers[count] = header
            count = count + 1
        end
        first = false -- set first to false to switch off the header block
    else
        local name
        local i = 2 -- start at 2 because i is not incremented for the Name field
        for field in line:gmatch("[^,]+") do
            name = name or field -- the first field on the line is the row's name
            if items[name] then -- the name is already in items, so this is a data field
                items[name][headers[i]] = field -- assign the value under the header for this column
                i = i + 1
            else -- the name is not in the table yet, so create a new entry for it
                items[name] = {}
            end
        end
    end
end
Here is how you can load a csv using the I/O library:
-- Example of how to load the csv.
path = "some\\path\\to\\file.csv"
local f = assert(io.open(path))
local csv = f:read("*all")
f:close()
Alternatively, you can use io.lines(path), which takes the place of csv:gmatch("[^\n]+") in the for loop above; a sketch of that variant follows.
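For example, the same parsing could be wrapped around io.lines like this (just a sketch; the function name load_csv is illustrative, and the parsing body is identical to the gmatch version above):
-- Sketch: read the csv line by line with io.lines instead of loading
-- the whole file into one string first.
local function load_csv(path)
    local items, headers = {}, {}
    local first = true
    for line in io.lines(path) do
        if first then
            local count = 1
            for header in line:gmatch("[^,]+") do
                headers[count] = header
                count = count + 1
            end
            first = false
        else
            local name
            local i = 2
            for field in line:gmatch("[^,]+") do
                name = name or field
                if items[name] then
                    items[name][headers[i]] = field
                    i = i + 1
                else
                    items[name] = {}
                end
            end
        end
    end
    return items
end

local items = load_csv("some\\path\\to\\file.csv")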
Here is an example of using the resulting table:
-- print table out
print("items = {")
for name, item in pairs(items) do
    print(" " .. name .. " = { ")
    for field, value in pairs(item) do
        print(" " .. field .. " = " .. value .. ",")
    end
    print(" },")
end
print("}")
The output:
items = {
Infantry = {
type = human,
class = army,
power = 2,
},
Battleship = {
type = motorized,
class = navy,
power = 256,
},
Cavalry = {
type = motorized,
class = panzer,
power = 12,
},
}
Right now I am reading rows from a file and saving them in the database using the code below:
String strQuery = "INSERT INTO public.alarm (id, name, marks) VALUES (?, ?, ?)";
JDBCOutputFormat jdbcOutput = JDBCOutputFormat.buildJDBCOutputFormat()
        .setDrivername("org.postgresql.Driver")
        .setDBUrl("jdbc:postgresql://localhost:5432/postgres?user=michel&password=polnareff")
        .setQuery(strQuery)
        .setSqlTypes(new int[] { Types.INTEGER, Types.VARCHAR, Types.INTEGER }) // set the types
        .finish();

DataStream<Row> rows = FilterStream
        .map((tuple) -> {
            Row row = new Row(3);
            row.setField(0, tuple.f0);
            row.setField(1, tuple.f1);
            row.setField(2, tuple.f2);
            return row;
        });

rows.writeUsingOutputFormat(jdbcOutput);
env.execute();
    }
}
The above works fine: it picks rows one by one from the file and saves them in the database.
For example:
If the file contains:
1, mark, 20
then database entry will look like:
id name marks
------------------
1 mark 20
Now the requirement is that for every input row I have to create 2 different rows, so it should look like below:
For example:
If the file contains:
1, mark, 20
then database entry should look like this:
id name marks
------------------
1 mark-1 20
1 mark-2 20
I think I should return a List instead of a Row, and the DataStream variable should then look like DataStream<List<Row>> rows.
What should I change in JDBCOutputFormat variable in order to achieve this?
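One possible approach, not from this thread but only a sketch: you may not need to change the JDBCOutputFormat at all. A flatMap that emits two Row objects per input record keeps the stream as DataStream<Row>, which the existing sink already accepts. The tuple type Tuple3<Integer, String, Integer> is an assumption about what FilterStream carries, and the usual Flink imports (FlatMapFunction, Collector, Tuple3) are assumed:
// Sketch: emit two rows per input record, suffixing the name with -1 and -2.
// Assumes jdbcOutput is the output format built above.
DataStream<Row> rows = FilterStream
        .flatMap(new FlatMapFunction<Tuple3<Integer, String, Integer>, Row>() {
            @Override
            public void flatMap(Tuple3<Integer, String, Integer> tuple, Collector<Row> out) {
                for (int i = 1; i <= 2; i++) {
                    Row row = new Row(3);
                    row.setField(0, tuple.f0);
                    row.setField(1, tuple.f1 + "-" + i); // mark-1, mark-2
                    row.setField(2, tuple.f2);
                    out.collect(row);
                }
            }
        });

rows.writeUsingOutputFormat(jdbcOutput);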
I have a dataframe and would like to rename the columns based on a dictionary with multiple values per key. The dictionary key has the desired column name, and the values hold possible old column names. The column names have no pattern.
import pandas as pd
column_dict = {'a':['col_a','col_1'], 'b':['col_b','col_2'], 'c':['col_c','col_3']}
df = pd.DataFrame([(1,2.,'Hello'), (2,3.,"World")], columns=['col_1', 'col_2', 'col_3'])
Function to replace the old text with the key:
def replace_names(text, dict):
    for key in dict:
        text = text.replace(dict[key], key)
    return text

replace_names(df.columns.values, column_dict)
This gives an error when called on the column names:
AttributeError: 'numpy.ndarray' object has no attribute 'replace'
Is there another way to do this?
You can use df.rename(columns=...) if you supply a dict which maps old column names to new column names:
import pandas as pd
column_dict = {'a':['col_a','col_1'], 'b':['col_b','col_2'], 'c':['col_c','col_3']}
df = pd.DataFrame([(1,2.,'Hello'), (2,3.,"World")], columns=['col_1', 'col_2', 'col_3'])
col_map = {col:key for key, cols in column_dict.items() for col in cols}
df = df.rename(columns=col_map)
yields
   a    b      c
0  1  2.0  Hello
1  2  3.0  World
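A side note, not from the original answer: df.rename simply ignores labels that are not keys of the mapping, so any column missing from column_dict is left unchanged. A quick sketch with a hypothetical extra column:
# 'extra' is a hypothetical column that appears in no column_dict entry.
df2 = pd.DataFrame([(1, 2.0, 'Hello', 'x')],
                   columns=['col_1', 'col_2', 'col_3', 'extra'])
print(df2.rename(columns=col_map).columns.tolist())
# ['a', 'b', 'c', 'extra']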