I have the following code to write my outputs from SQL Server using Python: output is '1', '2', '3', but I want 1 2 3

import pyodbc

# The raw string keeps the backslash in the instance name literal.
conn = pyodbc.connect('Driver={SQL Server};'
                      r'Server=DESKTOP-IINBCRC\SQLEXPRESS;'
                      'Database=employees;'
                      'Trusted_Connection=yes;')
cur = conn.cursor()
cur.execute("select * from Login")
for row in cur:
    print(row)
The output is '1', '2', '3', but I want the output to be 1 2 3

So, given
txt = "'1', '2', '3'"
lst = [x.strip()[1:-1] for x in txt.split(",")]
" ".join(lst)
this should produce (I have tested it)
'1 2 3'
Or, more simply:
txt = txt.replace(",", "")
txt = txt.replace("'", "")
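If the values are still available as a row object rather than as already-printed text, the string surgery may not be needed at all. The following is a minimal sketch, assuming each row behaves like a tuple (as pyodbc rows generally do). The sample rows and the helper name format_row are made up for illustration and are not part of the original code.

```python
def format_row(row):
    """Join the values of one row with single spaces."""
    return " ".join(str(value) for value in row)


# Stand-in rows; with a live cursor, iterate over `cur` instead.
rows = [("1", "2", "3"), (4, 5, 6)]
for row in rows:
    print(format_row(row))
```

Converting each value with str() means the same helper handles both text and numeric columns.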

Related

Looping in R with tidyverse

I am trying to read multiple CSV files with dates in the format dd-mm-yyyy. I want to convert the months into seasons, for which I have used the following code (for one CSV file):
x = data %>%
  dplyr::mutate(year = lubridate::year(UDATE),
                month = lubridate::month(UDATE),
                day = lubridate::day(UDATE))
x %>%
  mutate(season = case_when(
    month %in% c('3', '4', '5', '6') ~ 'Summer',
    month %in% c('7', '8', '9', '10') ~ 'Monsoon',
    month %in% c('11', '12', '1', '2') ~ 'Winter'
  ))
Now I want to run this for multiple CSV files at once and export the converted data frames, so that the month is converted into a season in each file.
Can someone please suggest how to put this in a loop over multiple CSV files?
Thank you
Without data to reproduce the problem it is a little hard to help, so I wrote some generic code that might need a few tweaks to solve your problem.
First, I would create a single function that imports, transforms and exports the data.
prep_data <- function(file_name) {
  data <-
    read.csv2(file_name) %>%
    dplyr::mutate(
      # Assumes UDATE is read as dd-mm-yyyy text, so parse it into a date first
      UDATE = lubridate::dmy(UDATE),
      year = lubridate::year(UDATE),
      month = lubridate::month(UDATE),
      day = lubridate::day(UDATE),
      season = dplyr::case_when(
        month %in% c('3', '4', '5', '6') ~ 'Summer',
        month %in% c('7', '8', '9', '10') ~ 'Monsoon',
        month %in% c('11', '12', '1', '2') ~ 'Winter'
      )
    )
  # Add "new_" as a prefix to the file name so the original .csv is not overwritten
  out_file <- file.path(dirname(file_name), paste0("new_", basename(file_name)))
  write.csv2(data, file = out_file)
}
Then I would create a vector with all my file names.
all_files <- list.files(pattern = "\\.csv$", path = "your path with the csv files", full.names = TRUE)
Lastly, apply the function to all files.
purrr::map(.x = all_files, .f = prep_data)

P-values for glmer mixed effects logistic regression in Python

I have a dataset for one year for all employees with individual-level data (e.g. age, gender, promotions, etc.). Each employee is in a team of a certain manager. I have some variables on the team- and manager-levels as well (e.g. manager's tenure, team diversity, etc.). I want to explain the termination of employees (binary: left the company or not). I am running a multilevel logistic regression, where employees are grouped by their managers, therefore they share the same team- and manager-level characteristics.
So, my model looks like:
"Termination ~ Age + Time in company + Promotions + Manager tenure + Percent of employees who completed training", data, groups=data["Manager_ID"]
Dataset example:
import pandas as pd

data = {'Employee': ['ID1', 'ID2', 'ID3', 'ID4', 'ID5', 'ID6', 'ID7', 'ID8'],
        'Manager_ID': ['MID1', 'MID2', 'MID2', 'MID1', 'MID3', 'MID3', 'MID3', 'MID1'],
        'Termination': ['0', '0', '0', '0', '1', '1', '1', '0'],
        'Age': ['35', '40', '50', '24', '33', '46', '44', '31'],
        'TimeinCompany': ['1', '3', '10', '20', '4', '0', '4', '9'],
        'Promotions': ['1', '0', '0', '0', '1', '1', '1', '0'],
        'Manager_Tenure': ['10', '5', '5', '10', '8', '8', '8', '10'],
        'PercentCompletedTrainingTeam': ['40', '20', '20', '40', '49', '49', '49', '40']}
# Every name listed here must be a key of the dict above, otherwise the column is empty
columns = ['Employee', 'Manager_ID', 'Termination', 'Age', 'TimeinCompany', 'Promotions', 'Manager_Tenure', 'PercentCompletedTrainingTeam']
data = pd.DataFrame(data, columns=columns)
I managed to run a mixed effects logistic regression from Python using the lme4 package from R.
from rpy2.robjects import r, Formula
from rpy2.robjects.packages import importr

importr('lme4')
model1 = r.glmer(formula=Formula('Termination ~ Age + TimeinCompany + Promotions + Manager_Tenure + PercentCompletedTrainingTeam + (1 | Manager_ID)'),
                 data=data)
print(r.summary(model1))
I receive the following output for the full sample:
REML criterion at convergence: 54867.6
Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.9075 -0.3502 -0.2172 -0.0929  3.9378
Random effects:
 Groups     Name        Variance Std.Dev.
 Manager_ID (Intercept) 0.005033 0.07094
 Residual               0.072541 0.26933
Number of obs: 211974, groups: Manager_ID, 24316
Fixed effects:
                                Estimate  Std. Error t value
(Intercept)                   0.14635573  0.00299341  48.893
Age                          -0.00112153  0.00008079 -13.882
TimeinCompany                -0.00238352  0.00010314 -23.110
Promotions                   -0.01754085  0.00491545  -3.569
Manager_Tenure               -0.00044373  0.00010834  -4.096
PercentCompletedTrainingTeam -0.00014393  0.00002598  -5.540
Correlation of Fixed Effects:
            (Intr) Age    TmnCmpny Promotions Mngr_Tenure
Age         -0.817
TmnCmpny     0.370 -0.616
Promotions  -0.011 -0.009 -0.033
Mngr_Tenure -0.279  0.013 -0.076    0.035
PrcntCmpltT -0.309 -0.077 -0.021   -0.042      0.052
But no p-values are displayed. I have read that lme4 does not provide p-values for a number of reasons; however, I need them for a work presentation.
I tried several possible solutions that I found, but none of them worked:
importr('lmerTest')
importr('afex')
print(r.anova(model1))
does not display any output
print(r.anova(model1, ddf="Kenward-Roger"))
only displays npar, Sum Sq, Mean Sq, F value
print(r.summary(model1, ddf="merModLmerTest"))
Provides the same output as with just summary
print(r.anova(model1, "merModLmerTest"))
only displays npar, Sum Sq, Mean Sq, F value
Any ideas on how to get p-values are much appreciated.
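One thing the output above hints at: the REML criterion and the "t value" column are what lme4 prints for a linear mixed model, which is how glmer behaves when no family argument is given. A fit with family = binomial normally reports z values together with a Pr(>|z|) column in its summary, so that may be worth trying first. Separately, as a rough stopgap and not a replacement for a proper test, a two-sided p-value can be approximated from a reported statistic by treating it as standard normal. That normal approximation is an assumption here (arguably tolerable with 211974 observations). A minimal sketch using only the standard library follows; the function name wald_p_value is hypothetical.

```python
import math


def wald_p_value(statistic):
    """Two-sided p-value for a statistic treated as standard normal."""
    return math.erfc(abs(statistic) / math.sqrt(2))


# t values copied from the fixed-effects table in the question
t_values = {"Promotions": -3.569, "Manager_Tenure": -4.096}
for name, t in t_values.items():
    print(name, wald_p_value(t))
```

The identity used is p = 2 * (1 - Phi(|z|)), which equals erfc(|z| / sqrt(2)).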

Snowflake EXPLAIN query support with Snowflake JDBC Driver

Is there a way to run an EXPLAIN Snowflake query through the JDBC driver with the Snowflake extension? I am running net.snowflake snowflake-jdbc 3.12.8 and it throws an error saying net.snowflake.client.jdbc.SnowflakeSQLException: SQL compilation error: syntax error line 1 at position 15 unexpected 'EXPLAIN'. I see there are more recent versions, up to 3.12.16, but nothing in the release notes mentions this capability being added. The exact same query runs successfully in the Snowflake UI.
I had no problem using EXPLAIN and the Snowflake JDBC driver 3.12.8:
print(sc._jvm.net.snowflake.spark.snowflake.Utils.getClientInfoString())
x = sc._jvm.net.snowflake.spark.snowflake.Utils.runQuery(sfOptions, 'explain select * from numbers limit 10')
cols = x.getMetaData().getColumnNames()
print(cols)
while x.next():
    print([x.getString(i) for i in range(1, 1 + cols.size())])
The results show that I'm using the specified JDBC version (through PySpark) and the results of the EXPLAIN query:
{
"spark.version" : "2.4.4",
"spark.snowflakedb.version" : "2.8.1",
"spark.app.name" : "Simple App",
"scala.version" : "2.11.12",
"java.version" : "1.8.0_242",
"snowflakedb.jdbc.version" : "3.12.8"
}
['step', 'id', 'parent', 'operation', 'objects', 'alias', 'expressions', 'partitionsTotal', 'partitionsAssigned', 'bytesAssigned']
[None, None, None, 'GlobalStats', None, None, None, '1', '1', '512']
['1', '0', None, 'Result', None, None, 'NUMBERS.X', None, None, None]
['1', '1', '0', 'Limit', None, None, 'rowCount: 10', None, None, None]
['1', '2', '1', 'TableScan', 'TEMP.PUBLIC.NUMBERS', None, 'X', '1', '1', '512']
For further community debugging, you'll need to paste your code to check what's happening.
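In the meantime, one hypothesis may be worth checking. It is a guess, not something the question confirms. Assuming Snowflake reports error positions as zero-based offsets, position 15 is exactly where EXPLAIN would start if some layer wrapped the statement in a subquery such as select * from (...), where EXPLAIN is not accepted. The wrapper text and the helper name keyword_offset below are hypothetical and only illustrate the arithmetic.

```python
def keyword_offset(statement, keyword):
    """Zero-based offset of the first occurrence of keyword in statement."""
    return statement.index(keyword)


user_query = "EXPLAIN select * from numbers limit 10"
wrapped = "select * from (" + user_query + ")"  # hypothetical wrapper

print(keyword_offset(user_query, "EXPLAIN"))
print(keyword_offset(wrapped, "EXPLAIN"))
```

If the second number matches the position in the error message, logging the exact SQL text sent to the driver would confirm or rule out this explanation.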
The EXPLAIN query can be executed via the Snowflake JDBC connector.
Example:
ResultSet rs = stmt.executeQuery("explain SELECT top 5 * from SNOWFLAKE_SAMPLE_DATA.TPCH_SF10.ORDERS where O_ORDERDATE between '1992-01-01' and '1992-12-31'");
ResultSetMetaData rsmd = rs.getMetaData();
int numberOfColumns = rsmd.getColumnCount();
for (int i = 1; i <= numberOfColumns; i++) {
    String name = rsmd.getColumnName(i);
    System.out.println("name :" + name + " size :" + rsmd.getColumnDisplaySize(i));
}
Output:
name :id size :10
name :parent size :10
name :operation size :16777216
name :objects size :16777216
name :alias size :16777216
name :expressions size :16777216
name :partitionsTotal size :39
name :partitionsAssigned size :39
name :bytesAssigned size :39
Thanks,
Sujan Ghosh

CakePHP count() returns wrong results?

CakePHP 3.6.14
This code produces the wrong number:
$where = [
    'Postings.source' => $source,
    'Postings.approved' => 1,
    'Postings.deleted' => 0,
    'Postings.disabled' => 0
];
if ($source !== null) {
    $where['Postings.created >='] = '(NOW() - INTERVAL 3 DAY)';
}
$count = $this->Postings
    ->find()
    ->where($where)
    ->count();
debug($count); exit;
// 77568, the total of all records
########## DEBUG ##########
[
    'Postings.source' => 'xzy',
    'Postings.approved' => (int) 1,
    'Postings.deleted' => (int) 0,
    'Postings.disabled' => (int) 0,
    'Postings.created >=' => '(NOW() - INTERVAL 3 DAY)'
]
// SQL produced by this query:
SELECT (COUNT(*)) AS `count`
FROM postings Postings
WHERE (
    Postings.source = 'xzy'
    AND Postings.approved = 1
    AND Postings.deleted = 0
    AND Postings.disabled = 0
    AND Postings.created >= '(NOW() - INTERVAL 3 DAY)' // <<<< with quotes
)
But the raw SQL query:
SELECT COUNT(*) as `count`
FROM `postings`
WHERE `source` = 'xzy'
    AND `approved` = 1
    AND `deleted` = 0
    AND `disabled` = 0
    AND `created` >= (NOW() - INTERVAL 3 DAY) // <<< without quotes
// returns the correct number, 2119
How can I fix this?
Values on the right-hand side of a key => value condition are always subject to binding/casting/escaping, unless the value is an expression object. Look at the generated query: your SQL snippet ends up as a string literal, i.e.:
created >= '(NOW() - INTERVAL 3 DAY)'
Long story short, use an expression, either a raw one:
$where['Postings.created >='] = $this->Postings->query()->newExpr('NOW() - INTERVAL 3 DAY');
or use the functions builder:
$builder = $this->Postings->query()->func();
$where['Postings.created >='] = $builder->dateAdd($builder->now(), -3, 'DAY');
See also
Cookbook > Database Access & ORM > Query Builder > Advanced Conditions
Cookbook > Database Access & ORM > Query Builder > Using SQL Functions
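The binding behaviour described in this answer is not specific to CakePHP. As a rough illustration only, here is a sketch that uses Python's built-in sqlite3 module instead of CakePHP and MySQL (a swapped-in stack, chosen because it runs without a server). A value passed as a bound parameter comes back as a plain string, while the same text written directly into the SQL is evaluated as an expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Bound parameter: the text is treated as a string value, never evaluated.
bound = conn.execute("SELECT ?", ("1 + 1",)).fetchone()[0]

# Inline SQL: the same text is parsed and evaluated as an expression.
inline = conn.execute("SELECT 1 + 1").fetchone()[0]

print(repr(bound), repr(inline))
conn.close()
```

This is the same reason the quoted '(NOW() - INTERVAL 3 DAY)' in the generated query is compared as text rather than computed as a date.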
You should use the query builder and add a count function to the select() call.
Everything is described here: https://book.cakephp.org/3.0/en/orm/query-builder.html#using-sql-functions

Matlab with database and edit box

I am programming a license plate recognition program and have run into a problem. After going through a few examples, I worked out this code:
function pushbutton1_Callback(hObject, eventdata, handles)
    conn = database('baze', 'root', 'root', 'Vendor', 'MYSQL', 'Server', 'localhost', 'PortNumber', 3306);
    a = get(handles.edit8,'String');
    if iscell(a) && numel(a) == 1
        a = a{1};
    end
    if ~ischar(a) || isempty(a);
        error('A valid string must be supplied!');
    end
    sqlquery = ['select vardas, pavarde, laipsnis, pareigos, telefonas, marke, numeris, tarnyba from info '...
        'where numeris = ' '''' a ''''];
    curs = exec(conn, sqlquery);
    curs = fetch(curs);
    curs.data
    close(curs)
    close(conn)
All of this is printed in the MATLAB command window, but I have a GUI application and want to show the retrieved values in the GUI's edit text boxes. For example: after I enter the car plate number, 'vardas' should appear in edit1 and 'pavarde' in edit2.
