No other way than dropping the duplicates for "ValueError: Index contains duplicate entries, cannot reshape"?

Hi everyone, this is my first question.
I'm working on a dataset from patients who underwent urine analysis.
Every row refers to a single Patient ID, and every Request ID can cover different types of urine analysis (aspect, colour, number of erythrocytes, bacteria, and so on).
I've added an image so you can see the structure of my dataset.
I'd like to reshape it so that one request = one row, with all the tests done in the same request on that row.
After that I want to merge it with another df, reshaped by Request ID the same way (because the first one was missing a "long result" column, which I downloaded from another software in use in our hospital).
I've tried:
df_pivot = df.pivot(index='Id Richiesta', columns='Nome Analisi Elementare', values='Risultato')
df_pivot.reset_index(inplace=True)
After that I want to do: df_merge = pd.merge(df_pivot, df, how='left', on='Id Richiesta')
I tried this once with another dataset, where I had to drop_duplicates for another purpose, and it worked.
But this time I have to analyse all the features.
What can I do? Is there no other way than dropping the duplicates?
Thank you for any help! :)
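
A minimal sketch of one alternative, using the column names from the question: first inspect which (request, analysis) pairs occur more than once, then pivot with pivot_table, which (unlike pivot) accepts an aggfunc to resolve duplicates. The choice of aggfunc ('last' here) is an assumption; pick whatever suits the data.
import pandas as pd

# show the rows whose (request, analysis) pair occurs more than once
dups = df[df.duplicated(subset=['Id Richiesta', 'Nome Analisi Elementare'], keep=False)]
print(dups.sort_values(['Id Richiesta', 'Nome Analisi Elementare']))

# pivot_table resolves duplicate index/column pairs with aggfunc instead of raising
df_pivot = df.pivot_table(index='Id Richiesta',
                          columns='Nome Analisi Elementare',
                          values='Risultato',
                          aggfunc='last')
df_pivot.reset_index(inplace=True)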

I've studied my data some more and discovered 1 duplicate of bacteria for the same request id (1 in almost 8 million entries....)
df[df[['Id Richiesta', 'Id Analisi Elementare', 'Risultato']].duplicated()]
Then I visualized all the rows referring to that "Id Richiesta" and kept the last (they were the same):
df = df.drop_duplicates(subset=['Id Richiesta', 'Id Analisi Elementare'], keep='last')
Thank you, and sorry.
Please tell me if I should delete this question.

Related

Matching and replacing a selection of data from two different dataframes

(First time posting so please bear with me.) I have two different dataframes, one of which contains a column of replacement data for a selection of data within the first dataframe.
#dataframe 1
df <- data.frame(site = rep(1:4, 3), landings = rep("val", 12),
                 harbour = c("a","b","c","d","e","f","g","h","i","j","k","l"))
#dataframe 2
new_site4 <- data.frame(harbour = c("a","b","c","d","e","f","g","h","i","j","k","l"),
                        sub_site = c("x","x","y","x","y","y","y","x","y","x","y","y"))
I want to replace the "site" in dataframe 1 with the "sub_site" in dataframe 2 based on the match of "harbour"; however, I only need to do it for records with site "4".
Is there a neat way to select only site 4 and then replace the site number with the sub_site, ideally without merging or creating a whole new dataframe? My real dataset is large, but the key is small, as it only refers to the small selection of the data which needs the sub_site added.
I tried using match() on my main dataset, but for some reason it only matched some of the required data, not all of it, and this code won't work on my sample data either.
#df$site[match(df$harbour, new_site4$harbour)] <- new_site4$sub_site[match(df$harbour, df$harbour)]
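A minimal sketch of one way to do this with the sample data above: index only the site-4 rows and look their harbours up in the key. Note the whole site column gets coerced to character, since sub_site holds letters.
# replace site only where site == 4, keyed on harbour
idx <- df$site == 4
df$site[idx] <- new_site4$sub_site[match(df$harbour[idx], new_site4$harbour)]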

Laravel skip and delete records from Database

I'm developing an app which needs to record a list of a user's recent video uploads. Importantly, it needs to remember only the last two videos associated with the user, so I'm trying to find a way to keep just the last two records in the database.
What I've got so far is below: it creates a new record correctly, and then it should delete all records older than the two most recent.
The problem is that this seems to delete ALL records, even though, by my understanding, the skip should leave out the two most recent records.
private function saveVideoToUserProfile($userId, $thumb ...)
{
    RecentVideos::create([
        'user_id' => $userId,
        'thumbnail' => $thumb,
        ...
    ]);
    RecentVideos::select('id')->where('user_id', $userId)->orderBy('created_at')->skip(2)->delete();
}
Can anyone see what I'm doing wrong?
Limit and offset do not work with delete, so you can do something like this:
$ids = RecentVideos::select('id')->where('user_id', $userId)->orderByDesc('created_at')->skip(2)->take(10000)->pluck('id');
RecentVideos::whereIn('id', $ids)->delete();
First off, skip() does not skip the x number of recent records, but rather the x number of records from the beginning of the result set. So in order to get your desired result, you need to sort the data in the correct order. orderBy() defaults to ordering ascending, but it accepts a second direction argument. Try orderBy('created_at', 'DESC'). (See the docs on orderBy().)
This is how I would recommend writing the query. Since offset is not applied on a delete (as noted above), fetch the ids first:
$ids = RecentVideos::where('user_id', $userId)->orderBy('created_at', 'DESC')->skip(2)->take(10000)->pluck('id');
RecentVideos::whereIn('id', $ids)->delete();

Show UniData SELECT results that are not record keys

I'm looking over some UniData fields for distinct values, but I'm hoping to find a simpler way of doing it. The values aren't keys to anything, so right now I'm selecting the records I'm interested in and selecting the data I need with SAVING UNIQUE. The problem is that, in order to see what I have, all I know to do is save it out to a savedlist and then read through the savedlist file I created.
Is there a way to see the contents of a select without running it against a file?
If you just want to visually look over the data, use LIST instead of SELECT.
The general syntax of the command is something like:
LIST filename WITH [criteria] [sort] [attributes | ALL]
So let's say you have a table called questions and want to look over all the authors for questions that used the tag unidata. Your query might look something like:
LIST questions WITH tag = "unidata" BY author author
Note: The second author isn't a mistake, it's the start of the list of attributes you want displayed - in this case just author, but you might want the record id as well, so you could do #ID author instead. Or just do ALL to display everything in each record.
I did BY author here as it will make spotting uniques easier, but you can also use other query features like BREAK.ON to help here as well.
I don't know why I didn't think of it at the time, but I basically needed something like SQL's DISTINCT statement, since I just needed to view the unique values. Replicating DISTINCT in UniData is explained here: https://forum.precisonline.com/index.php?topic=318.0.
The trick is to sort on the values using BY, get a single unique value of each using BREAK-ON, and then suppress everything except those unique values using DET-SUP.
LIST BUILDINGS BY CITY BREAK-ON CITY DET-SUP
CITY.............
Albuquerque
Arlington
Ashland
Clinton
Franklin
Greenville
Madison
Milton
Springfield
Washington

Laravel show records as flat array or single record

I have 2 columns in my setting table
with the following values:
KEY      VALUE
company  ABC
phone    14344
address  Somerset City
I need to display this like a single record or a flattened
array in the view/blade page,
something like:
{{$sett->company}}
{{$sett->phone}}
or an array with lookup:
{{ $myarray['company'] }}
{{ $myarray['phone'] }}
The idea is that if I add another setting, like a contact-us email address
for my website, I don't want to add another column.
I know this is achievable in the controller by creating different variables
and executing different queries, but I'm kind of looking for some options here.
Thanks for the help, really appreciated.
You can use $settings->pluck('value', 'key') to get your result. Read more here: https://laravel.com/docs/5.4/collections#method-pluck
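A minimal sketch of how that fits together, assuming an Eloquent model named Setting for the setting table (the model and view names are illustrative):
// controller: key the collection by the KEY column
$sett = Setting::pluck('value', 'key'); // ['company' => 'ABC', 'phone' => '14344', ...]
return view('settings', ['sett' => $sett]);

// blade view: the collection supports array access
{{ $sett['company'] }}
{{ $sett['phone'] }}
Adding a new setting is then just a new row in the table, not a new column.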

Autocomplete Dropdown - too much data, timing out

So, I have an autocomplete dropdown with a list of townships. Initially I just had the 20 or so that we had in the database... but recently we noticed that some of our data lies in other counties... even other states. So the answer to that was to buy one of those databases with all towns in the US (yes, I know, geocoding is the answer, but due to time constraints we are doing this until we have time for that feature).
So, when we had 20-25 towns the autocomplete worked stellarly... now that there are 80,000 it's not as easy.
As I type, I am thinking that the best way to do this is to default to this state, so there will be much less data. I will add a state selector to the page that defaults to NJ, and you can pick another state if need be; this will narrow the list down to < 1000. Though I may still have the same issue? Does anyone know of a workaround for an autocomplete with a lot of data?
should I post teh codez of my webservice?
Are you trying to autocomplete after only 1 character is typed? Maybe wait until 2 or more...?
Also, can you just return the top 10 rows, or something?
Sounds like your application is suffocating on the amount of data being returned, and then attempting to render it in the browser.
I assume that your database has the proper indexes, and you don't have a performance problem there.
I would limit the results of your service to no more than, say, 100 results. Users will not look at any more than that anyhow.
I would also only start retrieving the data from the service once 2 or 3 characters are entered, which will further reduce the scope of the query.
Good luck!
Stupid question maybe, but... have you checked to make sure you have an index on the town name column? I wouldn't think 80K names should be stressing your database...
I think you're on the right track. Use a series of cascading inputs, State -> County -> Township where each succeeding one grabs the potential population based on the value of the preceding one. Each input would validate against its potential population to avoid spurious inputs. I would suggest caching the intermediate results and querying against them for the autocomplete instead of going all the way back to the database each time.
If you have control of the underlying SQL, you may want to try several "UNION" queries instead of one query with several "OR like" lines in its where clause.
Check out this article on optimizing SQL.
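A rough sketch of the idea (table and column names are illustrative): each arm of the union stays sargable and can use its own index, which a single OR'd like often prevents.
-- instead of: where town like @partial + '%' or county like @partial + '%'
select town as name from places where town like @partial + '%'
union
select county as name from places where county like @partial + '%'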
I'd just limit the SQL query with a TOP clause. I also like using a "less than" instead of a like:
select top 10 name from cities where @partialname < name order by name;
That way "Ce" will give you "Cedar Grove" and "Cedar Knolls" but also "Chatham" and "Cherry Hill", so you always get ten.
In LINQ (using string.Compare, since C# does not define < for strings):
var q = (from c in db.Cities
         where string.Compare(partialname, c.Name) < 0
         orderby c.Name
         select c.Name).Take(10);
