I would like to remove items from the main page, but the information I need to decide which ones to remove is stored on the linked page. How do I approach this?
This is the markup on the first page, where I can see all articles. Each link's href contains a different ID, sometimes digits only and sometimes digits and letters.
<div class="NewsArticle">
<div class="featured-content-image">
<a href="/27312/72410214/" rel="bookmark">
<img class="imageclass" referrerpolicy="no-referrer" data-original="https://domainname.com/2021/0/Article01.jpg" src="https://domainname.com/2021/0/Article01.jpg" alt="Article01" style="display: inline;">
<div class="link-overlay"></div>
</a>
<span class="article-views">170392 Views</span>
</div>
This is on the second page, where a tag such as "military" is stored. Is it possible to remove the articles whose linked page contains "military"?
<a title="military" href="/category/military" rel="tag" style="margin-right:3px;margin-bottom:3px;" class="btn btn-info btn-md">military</a>
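Using jQuery, you can do something like
$("div").remove()
This selects all divs and removes them.
Or you can do
$( "div:contains('military')" ).remove()
This checks every div and removes the ones whose text contains "military". Note that :contains also matches any outer div that wraps such text, so a narrower selector such as div.NewsArticle:contains('military') is usually safer.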
IMPORTANT:
Make sure the userscript header contains
// @require https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js
And declare the globals so the editor's linter does not flag them:
/* globals jQuery, $, waitForKeyElements */
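The selectors above only see the listing page, while the "military" tag lives on each linked article page, so every link has to be checked against its own page first. The following is a minimal, language-neutral sketch of that check, written in Python rather than as a userscript. It assumes the article HTML has already been fetched; `page_has_tag`, `keep_links`, the `fetch_html` callback and the second URL are all made up for illustration.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect the title of every <a rel="tag"> link on a page."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("rel") == "tag" and attrs.get("title"):
            self.tags.add(attrs["title"].lower())


def page_has_tag(article_html, wanted):
    """Return True if the article page carries the given tag."""
    collector = TagCollector()
    collector.feed(article_html)
    return wanted.lower() in collector.tags


def keep_links(links, fetch_html, blocked):
    """Return the links whose article page lacks the blocked tag.

    fetch_html is a caller-supplied function: link -> HTML string.
    """
    return [link for link in links if not page_has_tag(fetch_html(link), blocked)]


# Canned pages stand in for real requests; the second entry is hypothetical.
PAGES = {
    "/27312/72410214/": '<a title="military" href="/category/military" rel="tag">military</a>',
    "/27312/00000000/": '<a title="sports" href="/category/sports" rel="tag">sports</a>',
}
kept = keep_links(list(PAGES), PAGES.get, "military")
print(kept)
```

In a userscript the same idea would mean requesting each article link, running the equivalent check on the response, and only then removing the matching div.NewsArticle element.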
After having created a few different spiders I thought I could scrape practically anything, but I've hit a roadblock.
Given the following code snippet:
<div class="col-md-4">
<div class="tab-title">Homepage</div>
<p>
<a target="_blank" rel="nofollow"
href="http://www.bitcoin.org">http://www.bitcoin.org
</a>
</p>
</div>
How would you go about selecting the link inside the <a>...</a> element based on the text within the tab-title div?
I need that condition because several other links match this selector:
response.css('div.col-md-4 a::attr(href)').extract()
My best guess is the following:
response.css('div.col-md-4 div.tab-title:contains("Homepage") a::attr(href)').extract()
Any insights are appreciated! Thank you in advance.
Note: I am using Scrapy.
How about this using XPath:
response.xpath('//div[@class="tab-title" and contains(., "Homepage")]/..//a/@href')
Find a div with class tab-title whose text contains Homepage, then step up to its parent and look for an <a> descendant at any depth.
EDIT:
Using CSS, you should be able to do it like this:
response.css('div.tab-title:contains("Homepage") ~ * a::attr(href)')
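The parent-step idea from the XPath answer can be tried without Scrapy. This is a minimal sketch that swaps in Python's standard-library xml.etree.ElementTree for Scrapy's selectors. ElementTree supports only a subset of XPath (no contains() and no attribute step), so the predicate here is an exact text match and the href is read with .get(). The wrapping row and the second column are made up so that there is something to filter out.

```python
import xml.etree.ElementTree as ET

# The snippet from the question plus a hypothetical second column.
HTML = """
<div class="row">
  <div class="col-md-4">
    <div class="tab-title">Homepage</div>
    <p><a target="_blank" rel="nofollow" href="http://www.bitcoin.org">http://www.bitcoin.org</a></p>
  </div>
  <div class="col-md-4">
    <div class="tab-title">Explorer</div>
    <p><a target="_blank" rel="nofollow" href="http://explorer.example">http://explorer.example</a></p>
  </div>
</div>
"""

root = ET.fromstring(HTML)

# Find the title div, step up to its parent with "..",
# then search all of the parent's descendants for an <a>.
link = root.find(".//div[@class='tab-title'][.='Homepage']/..//a")
homepage_href = link.get("href")
print(homepage_href)
```

Only the link that shares a parent with the "Homepage" title is returned; the link in the other column is ignored.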
I am new to WebDriver and I am automating a site. I have to find some text on a page and then click on its parent div. Can anyone please help?
Below is the HTML code.
<div class="col-md-6 col-sm-6 col-lg-6">
<p>
<strong class="detailShow ng-binding" ng-click="showJobs('NonInvite',item.taskId)"> TEST CR1 START DATE </strong>
</p>
</div>
The easiest way to achieve this is directly using xpath.
You can do it the 'uglier' way, like this:
.//strong[contains(@class,'detailShow')]/../..
Explanation:
/.. - this is how you get the parent element. A single /.. gives you the <p> tag, so you need a second one to reach the <div>.
Or you can do it in a cleaner way, like this:
.//strong[contains(@class,'detailShow')]/ancestor::div[@class='col-md-6 col-sm-6 col-lg-6']
Explanation:
ancestor::div - this walks up the tree and matches every ancestor that is a <div>. Since you need only one specific <div>, you have to say which one, therefore: ancestor::div[@class='col-md-6 col-sm-6 col-lg-6']
Now, if the bootstrap class is your only identifier, and there is a chance you may be inside a <table> you can also go with ancestor::div[1]
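To make the meaning of ancestor::div[1] concrete, here is a minimal sketch of "nearest <div> ancestor" using Python's standard-library xml.etree.ElementTree in place of WebDriver. ElementTree has no ancestor axis, so the helper `nearest_ancestor` (a made-up name) walks a parent map by hand. The outer wrapper div is also made up, to show that only the closest div is returned.

```python
import xml.etree.ElementTree as ET

# The markup from the question inside a hypothetical outer <div>.
HTML = """
<div class="outer">
  <div class="col-md-6 col-sm-6 col-lg-6">
    <p>
      <strong class="detailShow ng-binding"> TEST CR1 START DATE </strong>
    </p>
  </div>
</div>
"""


def nearest_ancestor(root, elem, tag):
    """Return the closest ancestor of elem with the given tag, or None."""
    # ElementTree elements do not know their parent, so build a lookup first.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = parents.get(elem)
    while node is not None and node.tag != tag:
        node = parents.get(node)
    return node


root = ET.fromstring(HTML)
strong = root.find(".//strong")
target = nearest_ancestor(root, strong, "div")
print(target.get("class"))
```

The walk skips the <p> and stops at the first <div> it meets, which is the element the click should go to.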
I have an Angular application using (UI) Bootstrap and I'm displaying a URL in a modal-body inside well and code elements. The modal-footer just has a Close button. Now if one triple-clicks the URL to copy it, the Close button text in the footer is also selected and one can not paste it to another tab as usual.
<div class="modal-body">
<div class="well">
<code>http://www.google.com/</code>
</div>
</div>
<div class="modal-footer">
<button type="button" ng-click="close()" class="btn btn-primary" data-dismiss="modal">Close</button>
</div>
I have searched a lot but cannot come up with a solution (have tried CSS's user-select, wrapping it in different elements, which do not seem to work).
Replace <button> with <input> to disable text selection:
<input type="button" ng-click="close()" class="btn btn-primary" data-dismiss="modal" value="Close">
Not exactly a scientific or technical answer (as I don't fully grasp the browser's Selection API); but what I have observed is that when you triple click the last child element in a DOM branch, it creates a hanging selection (for lack of a better term).
By hanging selection, I mean that it tries to look for the next selectable text element, but since it's the last child that was selected, it traverses the DOM to find the next text element. In my specific case, I was dealing with something like this:
<div>
<h2>Section Header</h2>
<div>
<p>foo</p>
<p>bar</p><!-- triple clicking this would select the next h2 content!! -->
</div>
</div>
<div>
<h2>Next Section Header</h2>
...
</div>
The solution I arrived at came from the naive thought of "give it something else to select" so that it wouldn't go looking elsewhere in the DOM. This definitely feels like a hacky workaround, but simply inserting 0-height <br> elements (so as not to affect the existing layout) immediately after any content that I expect users may try to triple-click solved all of my issues.
So in OP's case
<div class="modal-body">
<div class="well">
<code>http://www.google.com/</code>
<div style="line-height: 0px;"><!-- line-height is what affects br height, and you can't apply line-height directly on br tags for whatever reason -->
<br />
</div>
</div>
</div>
<div class="modal-footer">
<button type="button" ng-click="close()" class="btn btn-primary" data-dismiss="modal">Close</button>
</div>
At the time of posting this, there isn't much info about these types of issues so I figured I would share what worked for me in hopes that it helps someone else.
One good source of info I did find was in the form of a github issue comment here
It turns out there are a bunch of ways to cause hanging selections (because of the nature of the functionality I'm building, I could only verify that my solution works for #1 and #4):
1. triple-click on a paragraph
2. place the cursor at the beginning of a paragraph, shift-arrow-down past the end of the paragraph
3. place the cursor at the beginning of a paragraph, shift-arrow-right to the end of the paragraph (selects the text within the paragraph), then shift-arrow-right again (selects the whole paragraph)
4. drag the cursor from the beginning of a paragraph to the end (selects the text within the paragraph), then drag a little further / down (selects the whole paragraph)
I still want {{1+2}} to be evaluated as normal. But in addition to the normal interpolator, I want to create a custom one that I can program to do whatever I want.
e.g. <p>[[welcome_message]]</p> should be a shortcut for <p>{{'welcome_message' | translate}}</p>, or <p translate="welcome_message"></p>. That way, i18n apps would be much more convenient to write.
Is anything like that possible? I'm currently going through the angular.js source code, but the interpolation system looks pretty complicated(?). Any suggestions?
I created a directive that regex-find-replaces its innerHTML. Basically, I can rewrite any text into any other text. Here's how I did it:
How do I make angular.js reevaluate / recompile inner html?
Now all I have to do is to place my directive-attribute, "autotranslate", in one of the parent elements of where I want my interpolator to work, and it rewrites it however I want it! :D
<div class="panel panel-default" autotranslate>
<div class="panel-heading">[[WELCOME]]</div>
<div class="panel-body">
[[HELLO_WORLD]]
</div>
</div>
becomes
<div class="panel panel-default" autotranslate>
<div class="panel-heading"><span translate="WELCOME"></span></div>
<div class="panel-body">
<span translate="HELLO_WORLD"></span>
</div>
</div>
which does exactly what I wanted.
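The rewrite step itself is a plain regex substitution. Here is a minimal sketch of it, written in Python only to show the pattern; the actual directive runs inside AngularJS and then has to recompile the element. The function name `autotranslate` mirrors the directive attribute, and the assumption that keys consist of letters, digits and underscores is mine.

```python
import re

# [[KEY]] where KEY is letters, digits or underscores (an assumed key format).
PLACEHOLDER = re.compile(r"\[\[\s*(\w+)\s*\]\]")


def autotranslate(html: str) -> str:
    """Rewrite every [[KEY]] into <span translate="KEY"></span>."""
    return PLACEHOLDER.sub(r'<span translate="\1"></span>', html)


print(autotranslate('<div class="panel-heading">[[WELCOME]]</div>'))
```

Normal {{ ... }} expressions do not match the pattern, so they pass through untouched and are still evaluated by Angular's own interpolator.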
I don't think that's possible, but if you really want to save some characters you could create a function on your rootScope called t, then call it within your views:
<p>{{ t('welcome_message') }}</p>
I have a Drupal 7 site using CKEditor 4.2. I've created a basic page node and put a <span> inside an h2 heading in the body. I hard-coded it in the HTML view. It looks fine, but if I go back to edit the page, my <span> has been stripped out of the HTML, along with any style="" attributes I had added. I've looked at the CKEditor config and the text formats. I've set the only allowed formats to plain text and Full HTML, so I'm not using Filtered HTML at all. What gives? I've used the editor many times before, but probably not this version.
If you are using the CKeditor module then there is an option in Advanced Options that is also mentioned in the module homepage where you should set:
config.allowedContent = true;
None of the above solutions worked for me. What I found was that CKEditor was removing empty <span> tags from the HTML. For example:
<div class="section-heading">
<span class="sep-holder-l"><span class="sep-line"></span></span>
<h4>Section Header</h4>
<span class="sep-holder-r"><span class="sep-line"></span></span>
</div>
Would yield:
<div class="section-heading">
<h4>Section Header</h4>
</div>
However, if I added a non-breaking space in the innermost <span>, CKEditor didn't edit the HTML:
<div class="section-heading">
<span class="sep-holder-l"><span class="sep-line">&nbsp;</span></span>
<h4>Section Header</h4>
<span class="sep-holder-r"><span class="sep-line">&nbsp;</span></span>
</div>
Hopefully that helps someone out there!
In Drupal 7 there's no automatic synchronization between CKEditor's filter (called the Advanced Content Filter) and Drupal's filter. As I understand it, you configured the latter but not the former. See config.extraAllowedContent.
CKEditor 4.+ will remove any empty tags it finds which are in CKEDITOR.dtd.$removeEmpty as part of the HTML parsing process.
See this answer for a hack to avoid it.