Scraping based on "nested property" - css-selectors

After having created a few different spiders I thought I could scrape practically anything, but I've hit a roadblock.
Given the following code snippet:
<div class="col-md-4">
<div class="tab-title">Homepage</div>
<p>
<a target="_blank" rel="nofollow"
href="http://www.bitcoin.org">http://www.bitcoin.org
</a>
</p>
</div>
How would you go about selecting the link inside the <a>...</a> element based on the text within the tab-title div?
I need that condition because several other links also match this broader selector:
response.css('div.col-md-4 a::attr(href)').extract()
My best guess is the following:
response.css('div.col-md-4 div.tab-title:contains("Homepage") a::attr(href)').extract()
Any insights are appreciated! Thank you in advance.
Note: I am using Scrapy.

How about this using XPath:
response.xpath('//div[@class="tab-title" and contains(., "Homepage")]/..//a/@href')
This finds a div with class tab-title whose text contains "Homepage", steps up to its parent, and then selects the href of any <a> descendant at any depth.
EDIT:
Using CSS, you should be able to do it like this:
response.css('div.tab-title:contains("Homepage") ~ * a::attr(href)')
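The idea behind both selectors (find the title div, step up to its column, collect the links inside) can be tried outside a spider. What follows is only a minimal sketch using Python's standard-library xml.etree.ElementTree rather than Scrapy. The second column, the wrapping row div, and the links_under_title helper are made up for illustration, and this parser only accepts well-formed XML.

```python
import xml.etree.ElementTree as ET

# The question's snippet plus a second, made-up column to show the filtering.
SNIPPET = """
<div class="row">
  <div class="col-md-4">
    <div class="tab-title">Homepage</div>
    <p><a target="_blank" rel="nofollow" href="http://www.bitcoin.org">http://www.bitcoin.org</a></p>
  </div>
  <div class="col-md-4">
    <div class="tab-title">Explorer</div>
    <p><a href="http://example.com/explorer">explorer</a></p>
  </div>
</div>
"""


def links_under_title(root, title):
    """Collect hrefs from every div whose direct tab-title child mentions `title`."""
    hrefs = []
    for column in root.iter("div"):
        # Only a direct child counts as this column's title.
        label = column.find("div[@class='tab-title']")
        if label is not None and title in (label.text or ""):
            # Links may sit at any depth inside the column.
            hrefs.extend(a.get("href") for a in column.iter("a"))
    return hrefs


root = ET.fromstring(SNIPPET)
homepage_links = links_under_title(root, "Homepage")
print(homepage_links)
```

Starting from the column instead of from the title avoids the need for a parent step, which is the part that plain CSS selectors cannot express.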

Related

Have my code act as a p tag and not as a link if there is no URL

I am relatively new to AngularJS, so I am trying to learn how to do different things. I want solutionName to render as a plain p tag when there is no URL in solutionUrl1. At the moment solutionName behaves like a hyperlink even when it has no URL. Any help would be appreciated.
<a ng-href="{{::data.solutionUrl1}}" class="card__title" style="text-align: center">
<span>{{::data.solutionName}}</span>
</a>
Use AngularJS's ng-if to render either one or the other.
Something like the following should work, though you will most likely have to adjust the condition to your needs. You can also create a new variable in your JS file, such as showLink, set it to true or false depending on your conditions, and use that boolean in ng-if instead:
<div ng-if="data.solutionUrl1">
<!-- code to render the link-->
</div>
<div ng-if="!data.solutionUrl1">
<!-- code to render just the span without the link -->
</div>

How to process an array (or manually process ng-repeat)?

Using: AngularJS v1.3.15
Disclaimer: I know virtually nothing about AngularJS, but I'm "forced" to use it because it's used by a framework that I depend on.
I want to modify some html/angularjs that looks like this:
<ul>
<li ng-repeat="provider in model.externalProviders">
<a class="pure-button" href="{{provider.href}}">{{provider.text}}</a>
</li>
</ul>
I can see what is going on here... ng-repeat causes an iteration on the elements of the model.externalProviders collection/array. It works fine, but I have no control over content/styling individual <a> elements depending on the provider. I would like to change the content/appearance of the <a> element depending on type.
The relevant part of the model looks like this:
"externalProviders": [
{
"type": "Google",
"text": "Sign-in with Google",
"href": "https://localhost:44302/external?provider=Google&signin=04e029cf1018403f1757b097fbfb1ecb"
}
],
So I thought maybe there is a way to "select" or "pick" from externalProviders by type... If that type exists, then render the appropriate markup, e.g.:
<ul>
<!-- if model.externalProviders has item with type=="Google"... -->
<li>
<a class="pure-button button.google" href="{{provider.href}}"><i class="fab fa-google"></i>{{provider.text}}</a>
</li>
<!-- if model.externalProviders has item with type=="Facebook"... -->
<li>
<a class="pure-button button.facebook" href="{{provider.href}}"><i class="fab fa-facebook"></i>{{provider.text}}</a>
</li>
</ul>
Not sure what the proper search terms would be so I had trouble finding any info that might solve my problem. Is something like this possible with AngularJS? If so, how would I accomplish it?
As @Major Sam commented, ngClass might work for less simple scenarios, but I don't even need to go that far. Luckily I have control over the type property and the CSS, so I can make the type value match both my CSS class suffix and the Font Awesome icon class. This works:
<li ng-repeat="provider in model.externalProviders">
<a class="pure-button button-{{provider.type}}" href="{{provider.href}}"><i class="fab fa-{{provider.type}}"></i>{{provider.text}}</a>
</li>
Drawback: this doesn't let you change the actual markup per provider (for example, omitting the icon when Font Awesome has none for that provider).

How to click on a parent whose child contains a particular text

I am new to WebDriver and I am automating a site. I need to find a particular text on the page and click on its parent div. Can anyone please help?
Below is the HTML code.
<div class="col-md-6 col-sm-6 col-lg-6">
<p>
<strong class="detailShow ng-binding" ng-click="showJobs('NonInvite',item.taskId)"> TEST CR1 START DATE </strong>
</p>
</div>
The easiest way to achieve this is directly using xpath.
You can choose the 'more ugly' way like this:
.//strong[contains(@class,'detailShow')]/../..
Explanation:
/.. - this is how you get the parent element. A single /.. only gets you the <p> tag, so to reach the <div> you need a second one.
Or, you can go in a better manner like this:
.//strong[contains(@class,'detailShow')]/ancestor::div[@class='col-md-6 col-sm-6 col-lg-6']
Explanation:
ancestor::div - this walks up the tree and returns every ancestor that is a <div> tag. Since you need only one specific <div>, you have to say which one, therefore: ancestor::div[@class='col-md-6 col-sm-6 col-lg-6']
Now, if the Bootstrap class is your only identifier, and there is a chance you may be inside a <table>, you can also go with ancestor::div[1], which selects the nearest <div> ancestor.
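To make the "nearest <div> ancestor" idea concrete, here is a rough sketch in Python using only the standard library, not WebDriver. ElementTree has no ancestor axis, so the walk up the tree is done by hand; the nearest_ancestor helper and the wrapping <body> are assumptions for illustration, and the ng-click attribute is left out to keep the sample short.

```python
import xml.etree.ElementTree as ET

PAGE = """
<body>
  <div class="col-md-6 col-sm-6 col-lg-6">
    <p>
      <strong class="detailShow ng-binding"> TEST CR1 START DATE </strong>
    </p>
  </div>
</body>
"""


def nearest_ancestor(root, element, tag):
    """Return the closest ancestor of `element` with the given tag, or None."""
    # ElementTree elements do not know their parent, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = parent_of.get(element)
    while node is not None and node.tag != tag:
        node = parent_of.get(node)
    return node


root = ET.fromstring(PAGE)
# Locate the element by its text, then climb: strong -> p -> div.
strong = next(el for el in root.iter("strong")
              if "TEST CR1 START DATE" in (el.text or ""))
target_div = nearest_ancestor(root, strong, "div")
print(target_div.get("class"))
```

The loop stops at the first matching ancestor, which is what the [1] position does on the ancestor axis. With a real browser session you would hand the XPath string to the driver instead of walking the tree yourself.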

ckeditor strips <span> and style attributes

I have a Drupal 7 site using CKEditor 4.2. I created a basic page node and put a span inside an h2 heading in the body, hard-coding it in the HTML view. It looks fine, but when I go back to edit the page, my <span> has been stripped out of the HTML, along with any style="" attributes I added. I've looked at the CKEditor config and the text formats. The only formats I allow are text and full HTML, so I'm not using filtered HTML at all. What gives? I've used the editor many times before, but probably not this version.
If you are using the CKEditor module, there is a setting under Advanced Options, also mentioned on the module homepage, where you should set:
config.allowedContent = true;
None of the above solutions worked for me. What I found was that CKEditor was removing empty <span> tags from the HTML. For example:
<div class="section-heading">
<span class="sep-holder-l"><span class="sep-line"></span></span>
<h4>Section Header</h4>
<span class="sep-holder-r"><span class="sep-line"></span></span>
</div>
Would yield:
<div class="section-heading">
<h4>Section Header</h4>
</div>
However, if I added a non-breaking space in the innermost <span>, CKEditor didn't edit the HTML:
<div class="section-heading">
<span class="sep-holder-l"><span class="sep-line">&nbsp;</span></span>
<h4>Section Header</h4>
<span class="sep-holder-r"><span class="sep-line">&nbsp;</span></span>
</div>
Hopefully that helps someone out there!
In Drupal 7 there's no automatic synchronization between CKEditor's filter (called the Advanced Content Filter) and Drupal's filter. As I understand it, you configured the latter but not the former. See config.extraAllowedContent.
CKEditor 4.x removes any empty tags listed in CKEDITOR.dtd.$removeEmpty as part of its HTML parsing process.
See this answer for a hack to avoid it.

Conditionally change img src based on model data

I want to represent model data as different images using Angular, but I'm having some trouble finding the "right" way to do it. The Angular API docs on expressions say that conditional expressions are not allowed...
Simplifying a lot, the model data is fetched via AJAX and shows you the status of each interface on a router. Something like:
$scope.interfaces = ["UP", "DOWN", "UP", "UP", "UP", "UP", "DOWN"]
So, in Angular, we can display the state of each interface with something like:
<ul>
<li ng-repeat="interface in interfaces">{{interface}}</li>
</ul>
BUT - Instead of the values from the model, I'd like to show a suitable image. Something following this general idea.
<ul>
<li ng-repeat="interface in interfaces">
{{if interface=="UP"}}
<img src='green-checkmark.png'>
{{else}}
<img src='big-black-X.png'>
{{/if}}
</ul>
(I think Ember supports this type of construct)
Of course, I could modify the controller to return image URLs based on the actual model data but that seems to violate the separation of model and view, no?
This SO Posting suggested using a directive to change the bg-img source. But then we are back to putting URLs in the JS not the template...
All suggestions appreciated. Thanks.
please excuse any typos
Instead of src you need ng-src.
AngularJS expressions support the logical operators && and ||, which you can combine like this:
condition && valueIfTrue || valueIfFalse
So your img tag would look like this:
<img ng-src="{{interface == 'UP' && 'green-checkmark.png' || 'big-black-X.png'}}"/>
Note: the quotes (i.e. 'green-checkmark.png') are important here. Without them the file name is evaluated as an expression rather than a string, and it won't work.
plunker here (open dev tools to see the produced HTML)
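The condition && a || b pattern works through short-circuit evaluation: && yields its right operand when the left one is truthy, and || yields its right operand when the left one is falsy. Python's and/or follow the same rules, so the behavior can be sketched outside a template. This is only an analogy in another language, and the function names are hypothetical.

```python
def image_for(status):
    # Mirrors: interface == 'UP' && 'green-checkmark.png' || 'big-black-X.png'
    return status == "UP" and "green-checkmark.png" or "big-black-X.png"


def label_for(status):
    # Weak spot of the pattern: the "true" value here is an empty string,
    # which is falsy, so evaluation falls through to the fallback.
    return status == "UP" and "" or "down"


interfaces = ["UP", "DOWN", "UP"]
images = [image_for(status) for status in interfaces]
print(images)
print(repr(label_for("UP")))
```

With non-empty file names, as in the img tag above, the fall-through cannot happen, which is why the pattern is safe for this use.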
Another alternative (other than the binary operators suggested by @jm-) is to use ng-switch:
<span ng-switch on="interface">
<img ng-switch-when="UP" src='green-checkmark.png'>
<img ng-switch-default src='big-black-X.png'>
</span>
ng-switch will likely be better/easier if you have more than two images.
Another way, using the ternary operator:
<img ng-src="{{!video.playing ? 'img/icons/play-rounded-button-outline.svg' : 'img/icons/pause-thin-rounded-button.svg'}}" />
<ul>
<li ng-repeat="interface in interfaces">
<img src='green-checkmark.png' ng-show="interface=='UP'" />
<img src='big-black-X.png' ng-show="interface=='DOWN'" />
</li>
</ul>
For Angular 4 I have used:
<img [src]="data.pic ? data.pic : 'assets/images/no-image.png' " alt="Image" title="Image">
It works for me, and I hope it is useful to others on Angular 4-5 as well. :)

Resources