File name too long - dataset

In a local repository, I have several json files. When I run the command
from datasets import load_dataset
dataset = load_dataset('json', data_files=['./100009.json'])
I got the following error:
OSError: [Errno 36] File name too long: '/home/infinity/.cache/huggingface/datasets/_home_infinity_.cache_huggingface_datasets_json_default-80a93068b3a4a494_0.0.0_83d5b3a2f62630efc6b5315f00f20209b4ad91a00ac586597caee3a4da0bef02.lock'
Maybe it is obvious, but I am not sure how to solve it. Can you help?
EDIT
Here is the content of the JSON file:
{
"id": "68af48116a252820a1e103727003d1087cb21a32",
"article": [
"by mark duell .",
"published : .",
"05:58 est , 10 september 2012 .",
"| .",
"updated : .",
"07:38 est , 10 september 2012 .",
"a pet owner starved her two dogs so badly that one was forced to eat part of his mother 's dead body in a desperate attempt to survive .",
"the mother died a ` horrendous ' death and both were in a terrible state when found after two weeks of starvation earlier this year at the home of katrina plumridge , 31 , in grimsby , lincolnshire .",
"the barely-alive dog was ` shockingly thin ' and the house had a ` nauseating and overpowering ' stench , grimsby magistrates court heard .",
"warning : graphic content .",
"horrendous : the male dog , scrappy -lrb- right -rrb- , was so badly emaciated that he ate the body of his mother ronnie -lrb- centre -rrb- to try to survive at the home of katrina plumridge in grimsby , lincolnshire .",
"the suffering was so serious that the female staffordshire bull terrier , named ronnie , died of starvation , nigel burn , prosecuting , told the court last friday .",
"suspended jail term : the dogs were in a terrible state when found after two weeks of starvation at the home of katrina plumridge , 31 -lrb- pictured -rrb- .",
"the male dog , her son scrappy , was so badly emaciated that he ate her body to try to survive .",
"` the degree of suffering caused to both dogs was extreme and prolonged , ' mr burn said . ` it was as severe and extreme as it can get . '",
"the alarm was raised when a letting agent visited her home and saw dog mess on the steps , stairs , an upstairs floor and a bed .",
"a painfully thin dog jumped past him . he said its ribs , spine and hip bones could all be seen and it was the thinnest dog he had ever witnessed .",
"he tried to go into the kitchen but it was blocked from the inside by the dead body of the mother dog . the letting agent then called the royal society for the prevention of cruelty to animals .",
"mr burn said : ` every single bone in its frame was visible and the stomach was curved in . the empty dog bowls were bone dry . '",
"a decorator who went into the house said the stench made him feel physically sick , ronnie was like a skeleton and scrappy was ` shockingly thin ' .",
"a veterinary surgeon estimated that the dogs would have been suffering from starvation for at least two weeks .",
"plumridge moved out of the house on march 28 but the dogs were n't found until april 19 . she had claimed a friend was supposed to be finding new homes for the dogs and left them without going back to check on them .",
],
"abstract": [
"neglect by katrina plumridge saw staffordshire bull terrier ronnie die .",
"dog 's son scrappy was forced to eat her to survive at grimsby house .",
"alarm raised by letting agent shocked by ` thinnest dog he 'd ever seen '",
]
}

When you are working with large datasets, it can be convenient to use a pandas DataFrame.
import pandas as pd
# Replace the placeholder below with the path to your own JSON file
df = pd.read_json(r'Path where you saved the JSON file\File Name.json')
print(df)

This looks like a bug in the Hugging Face datasets library: it tries to create a lock file whose name is too long for the underlying file system (likely ext4 in the case of Ubuntu). I opened an issue here.
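To check whether a generated lock-file name really exceeds what the cache directory allows, its byte length can be compared with the per-name limit of the file system. The following is a minimal sketch, assuming a POSIX system where `os.pathconf` is available; `name_limit` and `name_fits` are hypothetical helper names and are not part of the datasets library. Some setups (for example an encrypted home directory) may report a lower limit than the 255 bytes that is typical of ext4.

```python
import os

# Lock-file name copied from the error message in the question.
LOCK_NAME = (
    "_home_infinity_.cache_huggingface_datasets_json_default-80a93068b3a4a494"
    "_0.0.0_83d5b3a2f62630efc6b5315f00f20209b4ad91a00ac586597caee3a4da0bef02.lock"
)


def name_limit(directory: str) -> int:
    """Return the maximum file-name length (in bytes) allowed inside `directory`."""
    # PC_NAME_MAX is the POSIX pathconf name for the per-component limit.
    return os.pathconf(directory, "PC_NAME_MAX")


def name_fits(filename: str, limit: int) -> bool:
    """Return True if the encoded file name is at most `limit` bytes long."""
    return len(os.fsencode(filename)) <= limit
```

Calling `name_fits(LOCK_NAME, name_limit(cache_dir))`, with `cache_dir` set to the datasets cache directory on the affected machine, would show whether the name is the problem on that particular file system.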


Solved • Swift Widgets • Changing text from random element in array

How I solved this question:
(A) Huge thanks to lorem ipsum for this documentation: https://developer.apple.com/news/?id=yv6so7ie
(B) In this example I wanted to list a series of facts, have my widget pick one each hour, and change the text on said widget.
This code was sourced from another company or organisation.
In this example I've defined FactList as an array of facts about dogs. I'd like the widget text to be one of those facts. This text value should then (ideally) change to another fact every minute.
Here's how to solve this question ➜
// Create an additional Swift file (not a SwiftUI view) and add it to your existing WidgetKit target.
// This file defines all possible facts in an array and picks one per hour of the day for the widget text.
import SwiftUI

let currentDate = Date()
// Fetches the current hour (0...23) from the calendar
let hour = Calendar.current.component(.hour, from: currentDate)
let chosenFact = FactProvider.getFact(for: hour)

// Defines all facts as an array
struct FactProvider {
    static let facts = [
        "33% of dog owners leave their dogs messages on their answering machine when they aren't home",
        "Dogs have sweat glands at the bottom of their paws. In a hot summer day you can wet them and your friend will stay cool",
        "A dog’s nose is always wet so they can absorb certain scents. Dogs will lick the nose to smell the scents",
        "A dog’s urine is so acidic it can corrode metal. Keep valuable things away from your dog - they may be destroyed",
        "Dogs really can sense the earth’s magnetic field. It’s a huge reason for why they’re so effective at finding how to get home",
        "A dog’s nose is always wet so they can absorb certain scents. Dogs will lick the nose to smell the scents",
        "A dog’s nose is so unique it can be used to identify a single dog of all others, much like a fingerprint",
        "About 31% of all dogs can snore in their sleep, compared to the estimated 45% of humans",
        "According to a 2017 survey, the most popular dog name is Max for a boy, and Bella for a girl",
        "Every single dog breed is a descendant of a wolf",
        "When dogs poop they align with the earth’s magnetic field: with the north-south axis to be specific",
        "For every eight people on earth, there is one dog. However 2/3s of these dogs are stray dogs",
        "Most dog’s paws smell like corn chips. This happens because of the build-up sweat and bacteria on the dog’s paws",
        "The countries with the most number of dogs are, in order, the USA, followed by Brazil with China in third place",
        "In WW2 Russians trained dogs to go on suicide missions, strapping bombs on their backs, having them run into the front lines of enemy troops. Poor dogs",
        "A dog is able to locate the source of a sound in a staggering 6/100th of a second",
        "Dogs can understand roughly 250 words or gestures",
        "A dog’s normal temperature is the same as the “fever temperature” for humans",
        "Dogs have 2x as many ear muscles as humans and can hear 4x better as well!",
        "A study from the Uni of Vienna shows dogs, in general, prefer men versus women. Especially with anxious men",
        "Wagging a dog’s tail to the right means they are happy, and to the left when scared, and low when insecure",
        "Dogs curl up due to inherent instincts to protect their vital organs and keep warm while asleep",
        "Dogs may feel jealous, but those “guilty-looking” puppy eyes aren’t because of guilt. Dogs can’t feel guilt",
        "It’s legal in Ohio for police to bite a crazed barking dog to calm them down. Officer VS Dog: who do you think will win?",
        "A 2015 survey reveals that 45% of US Dogs sleep in their owner’s bed",
        "Dogs can tell when we’re lying. Dogs can actually learn not to trust unreliable people",
        "Dogs don’t actually hate cats. They just think that since cats are small and speedy they are predators. Dogs and cats can, however, get along quite well from time to time",
        "Dogs sniff each other’s butts to learn about each other"
    ]

    // Maps the hour (0...23) to one fact. Arrays start at element [0], so hour 0 selects facts[0];
    // the modulo keeps the index inside the array even if the number of facts changes.
    static func getFact(for hour: Int) -> String {
        guard !facts.isEmpty, hour >= 0 else {
            return "We seem to have run into an error. Please hold whilst we attempt to sync your next fact..."
        }
        return facts[hour % facts.count]
    }
}
While you're welcome to use the code, please ensure you change the facts so as not to make derivative copies or direct rip-offs of the facts.
Please note that it is against the law to edit or make derivative copies of the facts present in this code. The copyright holders of this code will actively and aggressively maintain protection of their content (the facts themselves) to the full extent of the law.
Happy Coding!

Loading an Array tag from a .JSON file into snowflake

I have a JSON file that I am loading into Snowflake. One of the keys in the file has a value that is an array. How do I load this tag into a separate column of type ARRAY in Snowflake? It's already an array in the JSON. Do I still need to use an array_construct(tag_name_here) function to load it? What happens if, in subsequent records, the 'industry' tag is missing altogether? Please advise.
Below is a sample of the json...
[
[
{
"title": "Avino Silver & Gold Mines Ltd. Fourth Quarter and Year End Results to be Released on....",
"pubDate": "Tue, 25 Feb 2020 00:49:00 +0000",
"description": " Avino Silver & Gold Mines Ltd. plans to announce its Fourth Quarter and Year End 2019 financial results after the market closes. In addition, the Company...",
"industry": [
"Mining & Metals ",
"Mining ",
"MNG",
"MIN"
],
"subject": [
"Conference Call Announcements ",
"Earnings "
]
}
]
]
Take a look at the examples here:
https://docs.snowflake.net/manuals/user-guide/querying-semistructured.html
Generally you're looking at extracting data rather than constructing it (just use value:industry to fill in your array column). And if the tag is missing in some record, the column will simply be filled with NULL for that record.
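The behaviour described above (extract the existing array as-is; a missing key becomes NULL) can be illustrated outside Snowflake. This is only a plain-Python analogy, not Snowflake SQL: the two records are made-up stand-ins shaped like the sample, and `extract_industry` is a hypothetical helper name.

```python
import json

# Two placeholder records shaped like the sample; the second has no "industry" key.
raw = """
[
  {"title": "First", "industry": ["Mining & Metals ", "Mining ", "MNG", "MIN"]},
  {"title": "Second", "subject": ["Earnings "]}
]
"""


def extract_industry(records):
    """Return each record's "industry" array, or None when the key is absent."""
    # dict.get returns None for a missing key, mirroring the NULL described above.
    return [record.get("industry") for record in records]


industries = extract_industry(json.loads(raw))
```

The first entry keeps its array unchanged, so nothing has to be constructed; the second entry is `None`, which plays the role of NULL in this analogy.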

POS dynamic item capital

I've tried to search for this, but I haven't found an answer yet.
I have a case in an online payment system.
Let's say I have an item named "A".
The stock of "A" is 30,
but of that stock,
I bought 20 of "A" for $10 each from the supplier
and the remaining 10 of "A" for $15 each from the supplier,
and then I sell "A" for $20 each.
Now, how do I design the item table in the database for "A" so that I am able to calculate
the exact profit for "A"?
Do I have to insert 2 rows in the "item" table, containing
"A.1" with a capital of $10 and
"A.2" with a capital of $15?
If that's the conclusion, that means
when a customer buys 25 pieces of "A",
I have to decrease the A.1 stock to 0
and A.2 to 5.
Do you have a better solution than mine?
Sorry for my bad English and knowledge.
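
The two-row scheme described in this question amounts to tracking stock per purchase batch and consuming the oldest batch first. Below is a minimal sketch of that bookkeeping in Python, assuming oldest-first consumption (the post does not state the order explicitly, although the A.1/A.2 example implies it). `Lot` and `sell` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Lot:
    name: str         # e.g. "A.1"
    unit_cost: float  # purchase price per piece
    quantity: int     # pieces still in stock


def sell(lots, quantity, unit_price):
    """Take stock from the oldest lot first and return the profit of the sale."""
    if quantity > sum(lot.quantity for lot in lots):
        raise ValueError("not enough stock")
    cost = 0.0
    remaining = quantity
    for lot in lots:
        taken = min(lot.quantity, remaining)
        lot.quantity -= taken
        cost += taken * lot.unit_cost
        remaining -= taken
        if remaining == 0:
            break
    return quantity * unit_price - cost


lots = [Lot("A.1", 10.0, 20), Lot("A.2", 15.0, 10)]
profit = sell(lots, 25, 20.0)
# A.1 is now 0, A.2 is 5, and profit is 25*20 - (20*10 + 5*15) = 225.0
```

In a database this would correspond to one row per purchase batch (item, unit cost, remaining quantity) rather than a single row per item.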

Combined Select Statement

I need to select several values from a database, in different possible combinations, to check an event and perform the correct action for each:
Example Event: "Man wears a 1930 green Italian hat."
Need to get all values in a database like this:
object -- color -- country -- year
hat -- any -- any -- any
hat -- green -- any -- any
hat -- green -- Italian -- any
(...)
hat -- any -- Italian -- any
hat -- any -- Italian -- 1930
In a way I can check the actions bound to:
Wearing a green, Italian, 1930 hat
Wearing a green, hat
Wearing a hat
(And so on for all the possibilities that exist)
This way I could start all the procedures for a man wearing a hat and the specific procedures for a man wearing a green italian hat, for example.
What would be the most efficient way to do this?
What kind of database are you talking about?
Is this a SQL database?
Do you have a table for colors, one for countries, ... ?
What exactly are you trying to do? Create values from that example event or create all possible events from the values?
Since you tagged this with Lua, I assume those "properties" are in tables as in:
local colors = {"green","red"}
local countries = {"Italian","Mexican"}
local years = {1930,2016}
I'm guessing you want to create all possible events from those properties:
for _, color in ipairs(colors) do
  for _, country in ipairs(countries) do
    for _, year in ipairs(years) do
      print("Wearing a "..table.concat({color, country, year}, ", ").." hat")
    end
  end
end
You can run this code on the Lua demo page: http://www.lua.org/cgi-bin/demo
The result is that every possible combination is printed. The only thing that might not be right is the implementation of your "any".
If you want combinations with no color/country/year, you could add "" to the lists. It would probably be better to use a numeric for loop and add an extra iteration, as in:
for i=1,#colors+1 do
--etc
end
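If the goal is the lookup direction from the question (one event, and all rule rows that apply to it), each property can either be kept or replaced by "any", which gives 2^3 = 8 rows for three properties. Here is a sketch of that idea, written in Python rather than Lua; `matching_patterns` and the `ANY` marker are hypothetical names.

```python
from itertools import product

ANY = "any"


def matching_patterns(obj, color, country, year):
    """List every (object, color, country, year) rule row that matches the event."""
    # Each property is either generalized to ANY or kept at its concrete value.
    options = [(ANY, color), (ANY, country), (ANY, year)]
    return [(obj, *combo) for combo in product(*options)]


patterns = matching_patterns("hat", "green", "Italian", 1930)
for row in patterns:
    print(" -- ".join(str(field) for field in row))
```

The resulting rows can then be used as the keys to look up in the rules table, from the most general ("hat", any, any, any) to the fully specific one.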

MapReduce: How to join 2 tables: R(a,b) x S(c,d) where b<c

Each record in the input has the form:
Table_name(R/S) | attribute_1(a/c) | attribute_2(b/d)
.
.
.
For example this can be content of the input file:
R|$a_1$|$b_1$
R|$a_2$|$b_2$
S|$c_1$|$d_1$
R|$a_3$|$b_3$
S|$c_2$|$d_2$
The output consists of lines of the form:
$a_i$|$b_i$|$c_j$|$d_j$
where $b_i < c_j$.
(This is an exercise in the book "Mining of Massive Datasets". It is on page 22 of this link: http://infolab.stanford.edu/~ullman/mmds/ch2.pdf (exercise 2.3.5). The book is freely available.)
I've spent half a day searching the internet and still have no clue how to solve it...
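
As a starting point, the simplest correct formulation sends every record to one reducer, which then compares each R tuple with each S tuple. The sketch below simulates that in plain Python. It is a baseline under stated assumptions and not the book's solution: the attributes are assumed to be numeric, the function names and the in-memory driver `run` are hypothetical, and a single reducer key does not scale (partitioning by ranges of b and c would be a possible refinement).

```python
from collections import defaultdict


def map_record(line):
    """Map: parse 'R|a|b' or 'S|c|d' and send the record to one shared key."""
    table, first, second = line.strip().split("|")
    yield "all", (table, float(first), float(second))


def reduce_join(_key, values):
    """Reduce: emit (a, b, c, d) for every R/S pair with b < c."""
    r_rows = [(x, y) for table, x, y in values if table == "R"]
    s_rows = [(x, y) for table, x, y in values if table == "S"]
    for a, b in r_rows:
        for c, d in s_rows:
            if b < c:
                yield a, b, c, d


def run(lines):
    """Tiny in-memory driver standing in for the shuffle phase."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_record(line):
            groups[key].append(value)
    return [row for key, values in groups.items() for row in reduce_join(key, values)]


result = run(["R|1|2", "R|3|9", "S|5|6", "S|1|7"])
# Only R(1, 2) with S(5, 6) satisfies b < c, so result == [(1.0, 2.0, 5.0, 6.0)]
```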
