Inconvenient web import with flash

Pavel
Posts: 30
Joined: Wed Apr 11, 2007 1:46 am
Platform: Mac + Linux

Mon Dec 09, 2013 5:25 pm Post

I hope someone can help me. I have used Scrivener for several years, but have little experience with it because, not being a writer, I had no pressing reason to use it. That changed a few months ago when I went back to school and took to using Scrivener for my notes and for collecting outlines from instructors. As I use it more and discover its virtues, I am trying to press it into further and further service.
Until a few days ago I saved web pages of interest in one large folder, with no organization. Of course, that does not work well enough to make the effort worthwhile.

Now to my question, and my dismay. I have been trying to import web pages for a large research paper over the last few days, and I'm about to give up on it for one simple reason. When I need to import a page that has embedded Flash, which is getting to be the idiotic norm, the video is sucked in on import even if I block the Flash ad with an ad blocker, and it then plays every single time I look at the page. I want to scream! The import can also take upwards of two or three minutes, likely due to my slow internet connection. Is there a way out?

Thank you. I've got a deadline. I value this organizational feature. I don't want to lose my mind. Efforts so far have been in vain. Thanks! :)
Last edited by Pavel on Tue Dec 10, 2013 3:46 am, edited 1 time in total.
To find the answers ... question them!

KB
Site Admin
Posts: 20735
Joined: Tue Jun 13, 2006 11:23 pm
Platform: Mac
Location: Truro, Cornwall

Mon Dec 09, 2013 6:53 pm Post

Hi,

This is an unfortunate problem with such pages. Scrivener just uses Apple's WebKit for importing and displaying pages (the same as Safari uses), so when it grabs a web page it uses exactly the same techniques as Safari's Save As .webarchive feature. What I would recommend is converting these pages to text. Select them in the binder and use Documents > Convert > Web Page(s) to Text. This will strip out any videos and just leave you with the textual content and images. The only downside to this is that web pages with complex layout may need a bit of tidying up.

All the best,
Keith
"You can't waltz in here, use my toaster, and start spouting universal truths without qualification."

Pavel
Posts: 30
Joined: Wed Apr 11, 2007 1:46 am
Platform: Mac + Linux

Mon Dec 09, 2013 8:07 pm Post

Thank you very much for the reply. At least I know that I don't have to hunt around for some sort of setting. Some sites, such as Forbes, have odd tricks up their sleeves: for example, an imported page opens on an earlier page, and you have to click through to reach the page you actually saved.
It seems like the whole web has become annoyance ware.

Well I can live with this for now and will look to doing as you suggested, which is the ideal in any case - to have only the text, so it will be worth some effort, once my deadline passes.

I have two further questions pertaining to this. If a page is imported and the site then changes its contents, will Scrivener keep the original that was saved? I ask because inside Scrivener the content seems interactive, and I'd hate to have it so interactive that it could disappear or change. :)

Secondly, if I download and purchase Devonthink, will it capture web pages in the same way?

A few frustrations, mostly due to my unfamiliarity with the program, aside - what a marvelous productivity booster Scrivener is! :D Thanks again.

KB
Site Admin
Posts: 20735
Joined: Tue Jun 13, 2006 11:23 pm
Platform: Mac
Location: Truro, Cornwall

Mon Dec 09, 2013 9:50 pm Post

Pavel wrote: I have two further questions pertaining to this. If imported, and then the site changes the contents, will Scrivener keep the original that was saved? I ask this because inside of Scrivener the content seems interactive, and I'd hate to have it so interactive that it could disappear or change. :)


The content will stay the same. I believe it's just that WebKit's .webarchive downloader doesn't download *everything*, so certain adverts and suchlike will still get called from online. You can see what is saved offline by closing the project, turning off your internet connection, and then loading the project again - you'll see that the interactive elements that were being called from online don't load, but everything else will be there.
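
For anyone who wants to check this without going offline: a .webarchive file is a property list, so its stored resources can be listed directly. The following is only a rough sketch, assuming Python 3 and a .webarchive saved from Safari. The function name and the file path are hypothetical, and subframe archives are ignored for brevity.

```python
import plistlib


def list_archived_resources(archive_bytes: bytes) -> list[tuple[str, str, int]]:
    """Return (url, mime_type, size_in_bytes) for each resource stored in the archive."""
    archive = plistlib.loads(archive_bytes)  # detects binary or XML plists
    # The page itself is the main resource; images, scripts, etc. are subresources.
    resources = [archive["WebMainResource"]] + list(archive.get("WebSubresources", []))
    return [
        (
            res.get("WebResourceURL", ""),
            res.get("WebResourceMIMEType", ""),
            len(res.get("WebResourceData", b"")),
        )
        for res in resources
    ]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python list_webarchive.py saved-page.webarchive
    with open(sys.argv[1], "rb") as fh:
        for url, mime, size in list_archived_resources(fh.read()):
            print(f"{size:>10}  {mime:<30}  {url}")
```

Anything the page shows that does not appear in this list is presumably being fetched from the network each time the page is displayed.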

Pavel wrote: Secondly, if I download and purchase Devonthink, will it capture web pages in the same way?


To the best of my knowledge, Devonthink does save web pages the same way, yes.

By the way, there is a third way - you could export from Safari as PDF (go to Print the page and then choose to save it as PDF from the Print panel), then you could import the PDF file into Scrivener.

Pavel wrote: A few frustrations, mostly due to my unfamiliarity with the program, aside - what a marvelous productivity booster Scrivener is! :D Thanks again.


Glad you're liking it!

All the best,
Keith

robertdguthrie
Posts: 3075
Joined: Mon Nov 09, 2009 10:06 pm
Platform: Mac
Location: St. Louis, MO, USA

Mon Dec 09, 2013 9:57 pm Post

KB wrote: By the way, there is a third way - you could export from Safari as PDF (go to Print the page and then choose to save it as PDF from the Print panel), then you could import the PDF file into Scrivener.

I like to use Safari's "Reader" button in the URL bar when available. You can mouse to the bottom of the page to "print" the cleaned-up article as a PDF. If that doesn't work, I'd suggest setting up Evernote and the browser "clip to evernote" plugin for it; it has an option to just clip the article (if it can isolate that part), which you should be able to import into Scrivener later without those pesky adverts.
Often wrong, rarely in doubt.
Time for a change... I'm now rdale; same dog-avatar, same dog... channel?

ChiaLynn
Posts: 38
Joined: Wed Nov 16, 2011 2:11 am
Platform: Mac

Mon Dec 09, 2013 10:47 pm Post

As KB said, DevonThink does bring in web pages with all their horrible Flash intact, if you choose to import them as Web Archive or HTML. (Web Archive is static, so if the page changes, you'll still have the version you downloaded.) However, it can also import as either a paginated PDF or a single-page PDF. It will also display the link to the original page, if you need to include it in your citations or find it again.

I just recently discovered that if you add Scrivener to the File-->Print-->PDF dialogue in Firefox and Safari, it will send a PDF'd webpage directly to the Research folder of your frontmost project. (You can do it in Chrome, too, but you need to ask to Print Using System Dialogue.) That doesn't preserve the original URL, though, which DevonThink, Evernote, and Scrivener all will.

Pavel
Posts: 30
Joined: Wed Apr 11, 2007 1:46 am
Platform: Mac + Linux

Tue Dec 10, 2013 3:51 am Post

Thank you everyone! The print as PDF sounds like a great idea to try, and try it I shall. This place is as therapeutic as a real latte. More perhaps ... and sweet while sugar free!

asotir
Posts: 190
Joined: Sun Jun 24, 2012 10:38 pm
Platform: Mac

Tue Dec 10, 2013 1:46 pm Post

Two thoughts here:

First, as noted, printing to PDF will preserve images (not moving ones) and layout so you can read the info. But this freezes the page as of the moment you save it. For research purposes that could be what you want, or not.

Second, you could save the page using another browser. Try Firefox, save the complete page, and then edit the HTML file and delete the Flash content. Once you have done that, you can use Scrivener to import the page. But again, this will only give you the contents of the page as of the moment you save it.
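
If hand-editing the HTML gets tedious, the deletion step could be scripted. This is a minimal sketch, assuming Python 3 and assuming the Flash content sits in ordinary <object> and <embed> tags. The function name and file names are hypothetical, it removes every <object> and <embed> whether or not it is Flash, and regular expressions are a crude tool for HTML, so check the result before importing it.

```python
import re
from pathlib import Path

_FLAGS = re.IGNORECASE | re.DOTALL
# An <object> element up to its first closing tag.
_OBJECT_BLOCK = re.compile(r"<object\b[^>]*>.*?</object\s*>", _FLAGS)
# A standalone <embed> tag, with its optional closing tag.
_EMBED_TAG = re.compile(r"<embed\b[^>]*>(?:\s*</embed\s*>)?", _FLAGS)
# Closing tags left behind when <object> elements were nested.
_STRAY_CLOSE = re.compile(r"</(?:object|embed)\s*>", _FLAGS)


def strip_flash(html: str) -> str:
    """Return the HTML with <object> and <embed> elements removed."""
    for pattern in (_OBJECT_BLOCK, _EMBED_TAG, _STRAY_CLOSE):
        html = pattern.sub("", html)
    return html


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python strip_flash.py page.html page-clean.html
    src, dst = Path(sys.argv[1]), Path(sys.argv[2])
    text = src.read_text(encoding="utf-8", errors="surrogateescape")
    dst.write_text(strip_flash(text), encoding="utf-8", errors="surrogateescape")
```

Writing to a second file keeps the original saved page untouched in case the cleanup removes something you wanted.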

You could also select and copy the text and images you want for research, and paste them into a new document you create in your project research folder. This will give you the info you want - again, as of the moment you copy it - but has the added benefit of being editable. Then you could simply paste in the URL at the top or the bottom of the document, which should give you a link to the current, changing, page. Just a link though, and you would then click through to the page whenever you wanted updated, changed versions.

Just remember with a program as powerful as Scrivener, there is always another way to skin the cat.

- asotir