Graph/Gremlin query for social media use case - graph-databases

My use case is a social network. I want to get all the posts 'posted' by the people I follow. For each of these posts I want to know whether I have liked it, how many likes and comments it has (counts only), and the latest 3 comments with all their properties plus all the properties of the commenting user (name, etc.). What is the best way to get this in Gremlin, ideally without duplication?
g.addV('user').property('id',1).as('1').
addV('user').property('id',2).as('2').
addV('user').property('id',3).as('3').
addV('user').property('id',4).as('4').
addV('post').property('postId','post1').as('p1').
addV('post').property('postId','post2').as('p2').
addV('comment').property('id','c1').property('text','hi').as('c1').
addV('comment').property('id','c2').property('text','nice').as('c2').
addV('comment').property('id','c3').property('text','hello').as('c3').
addE('follow').from('1').to('2').
addE('follow').from('1').to('3').
addE('follow').from('1').to('4').
addE('posted').from('2').to('p1').
addE('posted').from('2').to('p2').
addE('liked').from('1').to('p2').
addE('liked').from('3').to('p2').
addE('liked').from('4').to('p2').
addE('commented').from('1').to('c1').
addE('comments').from('c1').to('p1').
addE('commented').from('2').to('c2').
addE('comments').from('c2').to('p2').iterate()

The commented edges should have a timestamp property; that is why the following query still has a todo in it, but the remaining part should be easy to figure out yourself.
g.V().has('user','id',1).as('me').
out('follow').as('friend').
out('posted').as('post'). /* all the posts 'posted' by the people I follow */
project('friend','post','liked','likes','comments','latest').
by(select('friend')).
by(select('post').by('postId')).
by(coalesce(__.in('liked').where(eq('me')).constant('yes'),
constant('no'))). /* whether I have liked it or not */
by(inE('liked').count()). /* no of likes */
by(inE('comments').count()). /* comments that post have(only count) */
by(__.in('comments').as('comment'). /* todo: order by time desc */
in('commented').as('user').limit(3). /* latest 3 comments */
select('comment','user').
by(valueMap()). /* with all properties */
fold())
Result for the sample graph:
==>[friend:v[2],post:post1,liked:no,likes:0,comments:1,latest:[[comment:[id:[c1],text:[hi]],user:[id:[1]]]]]
==>[friend:v[2],post:post2,liked:yes,likes:3,comments:1,latest:[[comment:[id:[c2],text:[nice]],user:[id:[2]]]]]
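To make the shape of the result (and the missing "order by time desc" step) easier to reason about, here is a rough pure-Python sketch of the same aggregation over an in-memory copy of the sample data. It is not Gremlin and does not talk to a graph database; the class names, the `feed` helper, and the timestamp values are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    id: str
    text: str
    author: int
    timestamp: int  # assumed: the timestamp the 'commented' edge should carry


@dataclass
class Post:
    post_id: str
    author: int
    liked_by: set = field(default_factory=set)
    comments: list = field(default_factory=list)


def feed(me, follows, posts, latest=3):
    """Summarise every post written by someone `me` follows."""
    rows = []
    for post in posts:
        if post.author not in follows.get(me, set()):
            continue
        # The step the query leaves as a todo: newest comments first.
        newest_first = sorted(post.comments, key=lambda c: c.timestamp, reverse=True)
        rows.append({
            "friend": post.author,
            "post": post.post_id,
            "liked": "yes" if me in post.liked_by else "no",
            "likes": len(post.liked_by),
            "comments": len(post.comments),
            "latest": [{"comment": c.text, "user": c.author} for c in newest_first[:latest]],
        })
    return rows


# Same users, posts, likes and comments as the sample graph (timestamps invented).
follows = {1: {2, 3, 4}}
posts = [
    Post("post1", 2, set(), [Comment("c1", "hi", 1, 100)]),
    Post("post2", 2, {1, 3, 4}, [Comment("c2", "nice", 2, 200)]),
]
for row in feed(1, follows, posts):
    print(row)
```

Each post is visited once and its likes and comments are summarised in place, which is the same "one row per post" idea that project() expresses in the query.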

Related

cloud firestore role based access example: can user create a comment?

While reading the cloud firestore role based access example https://firebase.google.com/docs/firestore/solutions/role-based-access#rules, step 4, I find it not clear whether the user can create a comment or not.
According to the above link, the comment is owned by a user, here is the data model:
/stories/{storyid}/comments/{commentid}
{
user: "alice",
content: "I think this is a great story!"
}
And the rules:
match /comments/{comment} {
allow read: if isOneOfRoles(get(/databases/$(database)/documents/stories/$(story)),
['owner', 'writer', 'commenter', 'reader']);
// Owners, writers, and commenters can create comments. The
// user id in the comment document must match the requesting
// user's id.
//
// Note: we have to use get() here to retrieve the story
// document so that we can check the user's role.
allow create: if isOneOfRoles(get(/databases/$(database)/documents/stories/$(story)),
['owner', 'writer', 'commenter'])
&& request.resource.data.user == request.auth.uid;
}
Note the last line of the rules: to create the comment, the authenticated user (request.auth.uid) has to be the user who owns the comment. However, before the comment is even created, how can this user property exist? Maybe the create rule should not require the last clause, "&& request.resource.data.user == request.auth.uid", and that clause should be added for updates instead.
Did I miss anything? By the way, do they actually test examples before publishing them as online reference? It is also a pity that these online documents carry no creation timestamp. I was told that nowadays anything two years old can be obsolete.
The request.resource variable contains the document as it will exist after this operation is completed (assuming it is allowed). So request.resource.data.user is the value of the user field that the operation is trying to write, not the value as it currently exists (that'd be resource.data.user, without request.).
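As a rough illustration of that distinction, here is a small Python model of the create rule. It is only a sketch, not how Firestore evaluates rules: `can_create_comment`, the `roles` dictionary, and the sample documents are hypothetical names, with `incoming_doc` standing in for request.resource.data.

```python
# Roles that the quoted rule allows to create comments.
WRITE_ROLES = {"owner", "writer", "commenter"}


def can_create_comment(story_roles, auth_uid, incoming_doc):
    """Mimic the `allow create` rule: a role check plus ownership of the incoming document."""
    has_role = story_roles.get(auth_uid) in WRITE_ROLES
    # incoming_doc is the document being written, so its "user" field exists
    # even though nothing has been stored yet.
    claims_own_uid = incoming_doc.get("user") == auth_uid
    return has_role and claims_own_uid


roles = {"alice": "commenter", "bob": "reader"}  # assumed shape of the story's roles
print(can_create_comment(roles, "alice", {"user": "alice", "content": "Great story!"}))
print(can_create_comment(roles, "alice", {"user": "bob", "content": "Spoofed author"}))
print(can_create_comment(roles, "bob", {"user": "bob", "content": "Readers cannot comment"}))
```

Only the first call is allowed: the second fails because the written user field does not match the caller, the third because a reader has no write role.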

How to filter multiple line of text field using Sharepoint Rest API?

Hi, I'm having trouble filtering SharePoint documents through the REST API because the column I filter on is a Multiple lines of text (plain text) column. The query only returns a null result.
A Single line of text column seems to work well, but I need Multiple lines of text because the metadata exceeds the 255-character limit.
I'm new to SharePoint, please help. Thank you.
I am using the API to pull from a multiple-line column, Product.
The first part just skips items where the field is empty, and I keep the JS old school for IE 11 users. product is defined earlier in the JS as a variable based on the page; I use it in many ways. Basically, the answer to your question is: check for null and continue, otherwise use indexOf().
for (i = 0; i < data.d.results.length; i++) {
if (data.d.results[i].Product == null) {
continue;
} else if (data.d.results[i].Product.indexOf(product) !== -1) {
var xid = data.d.results[i];
insertText(xid);
}
}
The success function for the GET passes each matching item on via insertText(xid). Hopefully this makes sense.
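The same client-side filtering idea, sketched in Python for anyone not working in browser JS. This assumes the list items have already been fetched and decoded into dictionaries; `matching_items` and the sample rows are made up for illustration and are not part of any SharePoint API.

```python
def matching_items(results, product):
    """Keep items whose multi-line Product field contains `product`."""
    matches = []
    for item in results:
        value = item.get("Product")
        if value is None:
            # Empty multi-line fields come back as null; skip them.
            continue
        if product in value:
            matches.append(item)
    return matches


# Hypothetical rows shaped like the decoded results array.
rows = [
    {"Title": "Spec A", "Product": "Widget Pro\nWidget Lite"},
    {"Title": "Spec B", "Product": None},
    {"Title": "Spec C", "Product": "Gadget"},
]
print(matching_items(rows, "Widget"))
```

The trade-off is the same as in the JS version: every item is downloaded first and the filtering happens on the client.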
Go to Site settings > Site permissions > Anonymous access, and disable the check against the Client Object Model permission requirement.

(C) - How would one compare 2 txt files REQUESTS.txt and AVAILABLE.txt, separating each str read into a (STR6, STR3, STR3, INT) formatted Structure?

I have been working on this program for over a week with no breakthrough. The question states as follows:
A disc file ‘REQUESTS.TXT’ contains airline flight data formatted (STR6, STR3, STR3, INT).
Example:
AA1011SFxLAx34 (American Airlines 1011, SF to LA, 34 seats)
W0924DNVDFW101 (Western 0924, DNV to DFW, 101 seats)
Another file ‘AVAILABL.TXT’ contains an unspecified number of reservation request records formatted identically as described above, except the Seats Available field is a Seats Requested field.
Guidelines:
Read reservation flights and process requests. If the request can be fulfilled (i.e. it is in AVAILABL and REQUESTS) then print "Reservation Processed", otherwise print "Reservation Denied".
Print out flight data file before and after reservations are processed, ordered by flight ID in a four(4) column format.
Print an overall outcome report for all processed.(Present totals for the number of requests satisfied and denied)
I have tried a few different approaches. I tried to split up the first STR6 with isalpha/isdigit and combine the parts to make the FlightID (AA + 1011). I then tried to split the remaining characters between STR3 and STR3 via isalpha and a for loop. Lastly, I tried to take the trailing digits for the number of seats and, in each loop iteration, multiply each digit by its place value (100, 10 or 1), adding it to a running total for availSeats (INT). I thought this would produce:
AA+1011 = AA1011(STR6) // W+0924 = W0924(STR6)
SFx(STR3) // DNV(STR3)
LAx(STR3) // DFW(STR3)
(3*10)+(4*1) = 34(INT) // (1*100)+(0*10)+(1*1) = 101(INT)
All of this stored within a Struct Array.
i.e.
FlightData Flight;
Flight[0].flightID = AA1011;
Flight[0].fromCity = SFx;
Flight[0].toCity = LAx;
Flight[0].seatsAvail = 34;
Flight[1].flightID = W0924;
Flight[1].fromCity = DNV;
Flight[1].toCity = DFW;
Flight[1].seatsAvail = 101;
I am really at a loss right now and have no way to progress other than searching for different techniques to make this work. I am clearly a beginner and will continue to practice and progress in C, but a push in the right direction on how to read this from a .txt file into a struct would be amazing. Also, if anyone has used another method to solve this problem, I would love to analyze it. Thanks!
(This is my first post. I spent a lot of time formatting it to be clear on Stack Overflow, so if I messed up in places, some constructive criticism would be useful! This applies to both my posting and my coding practices. Thanks again!)
EDIT: The question I am asking here is how to take a string such as AA1011SFxLAx34 and turn it into a structure like the one above. It must also work for the second string, W0924DNVDFW101, which has only 1 letter in its ID (rather than two, as in AA1011). I'm not sure what else I am supposed to edit after reading the guidelines.
I consider this a homework question, so I answer according to
"How do I ask and answer homework questions?"
Find a tutorial on C, work through it.
Then take a HelloWorld, modify it in small steps to approach your goal in steps from working program to working program. This way you should at least get to being able to read text from a file and print it.
Then learn to store parts of what you print into basic variables.
Then learn about structures.
And so on.
This way you will get quite close to the solution.
If that is not quite what you need, show the code you have at that point and ask a specific question about the first problem, explaining what you suspect the cause to be. Show code that has exactly that one problem, makes it visible, and produces no other warnings (compile with at least e.g. gcc -Wall mycode).
Fix it with the help of the comments/answers you receive, then repeat.
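For the splitting question in the EDIT, one possible approach is to read the record from the right instead of the left: the trailing digits are the seat count, the six characters before them are the two city codes, and whatever remains in front is the flight ID. Below is a hedged sketch of that idea in Python rather than C, just to show the logic; `parse_record` and the `FlightData` field names are my own, and it assumes the city codes are always exactly three characters and never end in a digit.

```python
from typing import NamedTuple


class FlightData(NamedTuple):
    flight_id: str
    from_city: str
    to_city: str
    seats: int


def parse_record(line):
    """Split e.g. 'AA1011SFxLAx34' into (flight ID, from, to, seats), working from the right."""
    line = line.strip()
    # Count the trailing digits: that run is the INT field.
    digits = len(line) - len(line.rstrip("0123456789"))
    body_length = len(line) - digits
    if digits == 0 or body_length < 7:
        raise ValueError(f"malformed record: {line!r}")
    body, seats = line[:body_length], int(line[body_length:])
    # The last six characters of the body are the two 3-character city codes.
    return FlightData(body[:-6], body[-6:-3], body[-3:], seats)


print(parse_record("AA1011SFxLAx34"))
print(parse_record("W0924DNVDFW101"))
```

Because the flight ID is taken as "everything that is left", it does not matter whether it has one leading letter or two. The same right-to-left indexing can be done in C with strlen() and isdigit().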

Two arrayCollection. Only one is an ArrayCollection [duplicate]

For 2 weeks, we have been having this problem while trying to flush new elements:
CRITICAL: Doctrine\ORM\ORMInvalidArgumentException:
A new entity was found through the relationship 'Comment#capture' that was not configured to cascade persist operations for entity
But the capture is already in the database, and we fetch it with findOneBy, so if we cascade persist it, or persist it, we get a
Table constraint violation: duplicate entry.
The comments are created in a loop with different captures, each with new, and all required fields are set.
With all of the entities persisted and/or fetched by a findOne (and all valid), the flush still fails.
I have been stuck on this issue for a while, so please help me.
I had the same problem and it was the same EntityManager. I wanted to insert an object related ManyToOne. And I don't want a cascade persist.
Example :
$category = $em->find("Category", 10);
$product = new Product();
$product->setCategory($category);
$em->persist($product);
$em->flush();
This throws the same exception for me.
So the solution is :
$category = $em->find("Category", 10);
$product = new Product();
$product->setCategory($category);
$em->merge($product);
$em->flush();
In my case a too early call of
$this->entityManager->clear();
caused the problem. It also disappeared by only doing a clear on the recent object, like
$this->entityManager->clear($capture);
My answer is relevant for topic, but not very relevant for your particular case, so for those googling I post this, as the answers above did not help me.
In my case, I had the same error with batch-processing entities that had a relation and that relation was set to the very same entity.
WHAT I DID WRONG:
When I did $this->entityManager->clear(); while processing batch of entities I would get this error, because next batch of entities would point to the detached related entity.
WHAT WENT WRONG:
I did not know that $this->entityManager->clear(); works the same as $this->entityManager->detach($entity);, except that it detaches ALL of the entities the entity manager is tracking.
I thought that $this->entityManager->clear(); also detaches related entities.
WHAT I SHOULD HAVE DONE:
I should have iterated over entities and detach them one by one - that would not detach the related entity that the future entities pointed to.
I hope this helps someone.
First of all, you should take better care of your code. I see about 3 different indentation styles in your entity and controller; this is hard to read and does not fit the Symfony2 coding standards.
The code you show for your controller is not complete, so we have no idea where $this->activeCapture comes from. Inside you have a $people['capture'], which I presume contains a Capture object. This is very important.
If the Capture in $people['capture'] was persisted or fetched by a different EntityManager than $this->entityManager (and, again, we do not know where that one comes from), Doctrine2 has no idea that the object is already persisted.
You should make sure to use the same instance of the Doctrine Entity Manager for all those operations (use spl_object_hash on the EM object to make sure they are the same instance).
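To show why the instance matters, here is a toy model in Python. It is not Doctrine and greatly simplifies what a unit of work does; `ToyEntityManager`, `try_flush`, and the two entity classes are invented for illustration. The only point it makes is that "managed" is tracked per manager instance, so an object loaded by one manager looks brand new to another.

```python
class Capture:
    def __init__(self, key):
        self.key = key


class Comment:
    def __init__(self, capture):
        self.capture = capture


class ToyEntityManager:
    """Toy identity map: an entity is managed only by the manager that loaded or persisted it."""

    def __init__(self):
        self._managed = {}  # id(object) -> object

    def find(self, key):
        capture = Capture(key)
        self._managed[id(capture)] = capture
        return capture

    def persist(self, entity):
        self._managed[id(entity)] = entity

    def flush(self):
        for entity in list(self._managed.values()):
            related = getattr(entity, "capture", None)
            if related is not None and id(related) not in self._managed:
                raise RuntimeError("new entity found through relationship 'Comment#capture'")
        return "flushed"


def try_flush(manager):
    try:
        return manager.flush()
    except RuntimeError as error:
        return f"error: {error}"


same_em = ToyEntityManager()
same_em.persist(Comment(same_em.find(10)))   # capture loaded by the same manager
print(try_flush(same_em))

other_em, main_em = ToyEntityManager(), ToyEntityManager()
main_em.persist(Comment(other_em.find(10)))  # capture loaded by a different manager
print(try_flush(main_em))
```

The second flush fails even though the capture "exists", which is the situation the spl_object_hash check is meant to detect.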
You can also tell the EntityManager what to do with the Capture object.
// Refreshes the persistent state of an entity from the database
$this->entityManager->refresh($captureEntity);
// Or
// Merges the state of a detached entity into the
// persistence context of this EntityManager and returns the managed copy of the entity.
$captureEntity = $this->entityManager->merge($captureEntity);
If this does not help, you should provide more code.
The error:
'Comment#capture' that was not configured to cascade persist operations for entity
The problem:
/**
* @ORM\ManyToOne(targetEntity="Capture", inversedBy="comments")
* @ORM\JoinColumn(name="capture_id", referencedColumnName="id",nullable=true)
*/
protected $capture;
Cascade persist is not configured on this association.
Try this:
/**
* @ORM\ManyToOne(targetEntity="Capture", inversedBy="comments", cascade={"persist", "remove"})
* @ORM\JoinColumn(name="capture_id", referencedColumnName="id",nullable=true)
*/
protected $capture;
Refreshing the entity in question helped my case.
/* $item->getProduct() is already set */
/* Assumes a Symfony controller, where getDoctrine() returns the registry and getManager() the EntityManager */
$em = $this->getDoctrine()->getManager();
/* Add these lines anyway */
$id = $item->getProduct()->getId();
$reference = $em->getReference(Product::class, $id);
$item->setProduct($reference);
/* Original code as follows */
$quote->getItems()->add($item);
$em->persist($quote);
$em->flush();
Despite my $item already having a Product set elsewhere, I was still getting the error.
Turns out it was set via a different instance of EntityManager.
So this is a hack of sorts, by retrieving id of the existing product, and then retrieving a reference of it, and using setProduct to "refresh" the whatever connection. I later fixed it by ensuring I have and use only a single instance of EntityManager in my codebase.
I got this error too when I tried to add a new entity.
A new entity was found through the relationship 'Application\Entity\User#chats'
that was not configured to cascade persist operations for entity: ###.
To solve this issue: Either explicitly call EntityManager#persist() on this unknown entity or
configure cascade persist this association in the mapping for example @ManyToOne(..,cascade={"persist"}).
In my case I was trying to save an entity that should not be saved. Its relations were filled in and Doctrine tried to save them (User has Chat in a ManyToMany, but the Chat was a temporary entity), which caused the collision.
If I use cascade={"persist"}, I get unwanted behaviour: the throwaway entity is saved. My solution was to remove the entity that must not be saved from every entity that is saved:
// User entity code
public function removeFromChats(Chat $c = null){
if ($c and $this->chats->contains($c)) {
$this->chats->removeElement($c);
}
}
Saving code
/* some code with the $chat entity */
$chat->addUser($user);
// saving
$user->removeFromChats($chat);
$this->getEntityManager()->persist($user);
$this->getEntityManager()->flush();
I want to describe my case, as it might be helpful to somebody.
Given two entities: AdSet and AdSetPlacement. AdSet has the following property:
/**
* @ORM\OneToOne(targetEntity="AdSetPlacement", mappedBy="adSet", cascade={"persist"})
*
* @JMS\Expose
*/
protected $placement;
The error then appears when I try to delete several AdSet objects in a loop, after the 1st iteration:
foreach($adSetIds as $adSetId) {
/** @var AdSet $adSet */
$adSet = $this->adSetRepository->findOneBy(["id" => $adSetId]);
$this->em->remove($adSet);
$this->em->flush();
}
Error
A new entity was found through the relationship 'AppBundle\Entity\AdSetPlacement#adSet' that was not configured to cascade persist operations for entity: AppBundle\Entity\AdSet@00000000117d7c930000000054c81ae1. To solve this issue: Either explicitly call EntityManager#persist() on this unknown entity or configure cascade persist this association in the mapping for example @ManyToOne(..,cascade={"persist"}). If you cannot find out which entity causes the problem implement 'AppBundle\Entity\AdSet#__toString()' to get a clue.
Solution
The solution was to add "remove" to $placement cascade options to be:
cascade={"persist","remove"}. This guarantees that Placement also becomes detached. Entity manager will "forget" about Placement object thinking of it as "removed" once AdSet is removed.
Bad alternative
When trying to figure out what's going on I've seen a couple answers or recommendations to simply use entity manager's clear method to completely clear persistence context.
foreach($adSetIds as $adSetId) {
/** @var AdSet $adSet */
$adSet = $this->adSetRepository->findOneBy(["id" => $adSetId]);
$this->em->remove($adSet);
$this->em->flush();
$this->em->clear();
}
That code also works and the issue goes away, but it is not always what you really want to do. In practice you quite rarely need to clear the entity manager.

Plotting a word-cloud by date for a twitter search result? (using R)

I wish to search Twitter for a word (let's say #google) and then generate a tag cloud of the words used in the tweets, but broken down by date (for example, a moving window of one hour that advances by 10 minutes each time and shows me how different words became more frequent throughout the day).
I would appreciate any help on how to go about this: resources for the information, code for the programming (R is the only language I am proficient in), and ideas on visualization. Questions:
How do I get the information?
In R, I found that the twitteR package has the searchTwitter command. But I don't know how big an "n" I can get from it. Also, it doesn't return the dates on which the tweets were posted.
I see here that I could get up to 1500 tweets, but this requires me to do the parsing manually (which leads me to step 2). Also, for my purposes, I would need tens of thousands of tweets. Is it even possible to get them retrospectively (for example, by asking for older posts each time through the API URL)? If not, there is the more general question of how to build a personal store of tweets on your home computer (a question which might be better left to another SO thread, although any insights from people here would be very interesting for me to read).
How to parse the information (in R)? I know that R has functions that could help in the RCurl and twitteR packages. But I don't know which, or how to use them. Any suggestions would be of help.
How to analyse? How to remove all the "not interesting" words? I found that the "tm" package in R has this example:
reuters <- tm_map(reuters, removeWords, stopwords("english"))
Would this do the trick? Should I do something else/more?
Also, I imagine I would want to do that after cutting my dataset according to time, which will require some POSIX-like functions (I am not sure which ones would be needed here, or how to use them).
And lastly, there is the question of visualization. How do I create a tag cloud of the words? I found a solution for this here; any other suggestions/recommendations?
I believe I am asking a huge question here, but I tried to break it into as many straightforward questions as possible. Any help will be welcomed!
Best,
Tal
Word/Tag cloud in R using "snippets" package
www.wordle.net
Using openNLP package you could pos-tag the tweets(pos=Part of speech) and then extract just the nouns, verbs or adjectives for visualization in a wordcloud.
Maybe you can query twitter and use the current system-time as a time-stamp, write to a local database and query again in increments of x secs/mins, etc.
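Once each tweet has a timestamp stored with it, the moving window from the question (one hour wide, advancing by 10 minutes) becomes a simple grouping step. Here is a minimal sketch of that step in Python, not R, and not tied to any Twitter API; `sliding_word_counts`, the tiny stopword set, and the sample tweets are invented for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

STOPWORDS = {"the", "a", "is", "to", "and", "of"}  # tiny illustrative list only


def sliding_word_counts(tweets, width=timedelta(hours=1), step=timedelta(minutes=10)):
    """Yield (window_start, Counter of words) for each window over (timestamp, text) pairs."""
    if not tweets:
        return
    tweets = sorted(tweets)
    start, last = tweets[0][0], tweets[-1][0]
    while start <= last:
        end = start + width
        counts = Counter(
            word
            for stamp, text in tweets
            if start <= stamp < end
            for word in text.lower().split()
            if word not in STOPWORDS
        )
        yield start, counts
        start += step


t0 = datetime(2010, 1, 1, 12, 0)
sample = [
    (t0, "Google is the best"),
    (t0 + timedelta(minutes=30), "google maps"),
    (t0 + timedelta(minutes=70), "maps and search"),
]
for window_start, counts in sliding_word_counts(sample):
    print(window_start.strftime("%H:%M"), counts.most_common(3))
```

Each window's counter can then be handed to whatever word cloud routine you use, giving one cloud per 10-minute step.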
There is historical data available at http://www.readwriteweb.com/archives/twitter_data_dump_infochimp_puts_1b_connections_up.php and http://www.wired.com/epicenter/2010/04/loc-google-twitter/
As for the plotting piece: I did a word cloud here: http://trends.techcrunch.com/2009/09/25/describe-yourself-in-3-or-4-words/ using the snippets package, my code is in there. I manually pulled out certain words. Check it out and let me know if you have more specific questions.
I note that this is an old question, and there are several solutions available via web search, but here's one answer (via http://blog.ouseful.info/2012/02/15/generating-twitter-wordclouds-in-r-prompted-by-an-open-learning-blogpost/):
require(twitteR)
searchTerm='#dev8d'
#Grab the tweets
rdmTweets <- searchTwitter(searchTerm, n=500)
#Use a handy helper function to put the tweets into a dataframe
tw.df=twListToDF(rdmTweets)
##Note: there are some handy, basic Twitter related functions here:
##https://github.com/matteoredaelli/twitter-r-utils
#For example:
RemoveAtPeople <- function(tweet) {
gsub("@\\w+", "", tweet)
}
#Then for example, remove @'d names
tweets <- as.vector(sapply(tw.df$text, RemoveAtPeople))
##Wordcloud - scripts available from various sources; I used:
#http://rdatamining.wordpress.com/2011/11/09/using-text-mining-to-find-out-what-rdatamining-tweets-are-about/
#Call with eg: tw.c=generateCorpus(tw.df$text)
generateCorpus= function(df,my.stopwords=c()){
#Load the text mining library
require(tm)
#The following is cribbed and seems to do what it says on the can
tw.corpus= Corpus(VectorSource(df))
# remove punctuation
tw.corpus = tm_map(tw.corpus, removePunctuation)
#normalise case (tm 0.6 and later need content_transformer() around base functions)
tw.corpus = tm_map(tw.corpus, content_transformer(tolower))
# remove stopwords
tw.corpus = tm_map(tw.corpus, removeWords, stopwords('english'))
tw.corpus = tm_map(tw.corpus, removeWords, my.stopwords)
tw.corpus
}
wordcloud.generate=function(corpus,min.freq=3){
require(wordcloud)
# wordLengths replaces the older minWordLength control option in current tm
doc.m = TermDocumentMatrix(corpus, control = list(wordLengths = c(1, Inf)))
dm = as.matrix(doc.m)
# calculate the frequency of words
v = sort(rowSums(dm), decreasing=TRUE)
d = data.frame(word=names(v), freq=v)
#Generate the wordcloud
wc=wordcloud(d$word, d$freq, min.freq=min.freq)
wc
}
print(wordcloud.generate(generateCorpus(tweets,'dev8d'),7))
##Generate an image file of the wordcloud
png('test.png', width=600,height=600)
wordcloud.generate(generateCorpus(tweets,'dev8d'),7)
dev.off()
#We could make it even easier if we hide away the tweet grabbing code. eg:
tweets.grabber=function(searchTerm,num=500){
require(twitteR)
rdmTweets = searchTwitter(searchTerm, n=num)
tw.df=twListToDF(rdmTweets)
as.vector(sapply(tw.df$text, RemoveAtPeople))
}
#Then we could do something like:
tweets=tweets.grabber('ukgc12')
wordcloud.generate(generateCorpus(tweets),3)
I would like to answer the part of your question about making a big word cloud.
What I did was:
Use s0.tweet <- searchTwitter(KEYWORD,n=1500) for 7 days or more, such as THIS.
Combine them by this command :
rdmTweets = c(s0.tweet,s1.tweet,s2.tweet,s3.tweet,s4.tweet,s5.tweet,s6.tweet,s7.tweet)
The result:
This Square Cloud consists of about 9000 tweets.
Source: People voice about Lynas Malaysia through Twitter Analysis with R CloudStat
Hope it helps!

Resources