Channel: The Unofficial Scrapebox FAQ

What is the “M” button in the harvester section?

It lets you merge a .txt file with your keyword list.  So if you wanted to run the site: operator against a list of urls, you can load the urls in the keyword section and then merge a file that contains site: with it.  It will add whatever is in the .txt file in front of every keyword in the list.

You can include multiple lines in the .txt file, and each keyword will be combined with every line in the .txt file.
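To make the merge concrete, here is a rough Python sketch of the behavior described above.  It is not Scrapebox code.  The function name merge_prefixes is made up for illustration, and joining the two pieces with no separator and ordering the output by merge line are assumptions.

```python
def merge_prefixes(keywords, merge_lines):
    """Put every line of the merge file in front of every keyword.

    Assumption: the two pieces are joined with no separator, which is
    what you want for an operator such as site:.
    """
    return [f"{prefix}{kw}" for prefix in merge_lines for kw in keywords]


urls = ["http://www.site.com", "http://www.site1.com"]
merged = merge_prefixes(urls, ["site:"])
for line in merged:
    print(line)
```

With two urls and a one-line merge file you get two queries.  With a three-line merge file you would get six.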


Error 500 when commenting to WordPress?

This means the data you are posting is bad.  Most likely your emails are invalid, for example missing the @ symbol or missing the .com (or other ending) after the domain.
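If you want to clean your email list before posting, a quick check along these lines can help.  This is only a sketch in Python, separate from Scrapebox.  The pattern is deliberately loose (something, an @, something, a dot and an ending), and the names EMAIL_RE and split_emails are hypothetical.

```python
import re

# Loose sanity check: text, one @, text, a dot, then a 2+ letter ending.
# It catches the obvious problems (no @, no .com style ending), nothing more.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$")


def split_emails(lines):
    """Return (good, bad) lists of emails from an iterable of lines."""
    good, bad = [], []
    for line in lines:
        candidate = line.strip()
        if EMAIL_RE.match(candidate):
            good.append(candidate)
        else:
            bad.append(candidate)
    return good, bad


good, bad = split_emails(["joe@example.com", "joeexample.com", "joe@example"])
print("good:", good)
print("bad:", bad)
```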

If you export failed entries from posting, then repost to the failed ones, why do more succeed the 2nd time around?

There are many reasons for this behavior.
- The site could have been off line or overloaded the first time around
- The connection could have timed out
- The proxy used the first time could have been bad

Any of these could mean that a 2nd run posts some additional comments.  There are too many variables to try to control every time.  Best practice to maximize a run is to lower connections and increase timeouts.  If you need maximum links from your runs, plan on reposting to the failed entries.

How can I scrape all the urls for a domain?

You can trim the url to root, so you have the domain homepage.  Then you can use the site: operator to scrape the additional urls.

site:http://www.rootdomain.com

Tip:  When using search operators, it's important that you understand what the operators do and how they work with the different engines.  A few Google searches on search operators will turn up a lot of useful information.
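As a rough illustration of the two steps above (trim to root, then prepend site:), here is a small Python sketch.  It works outside Scrapebox, and the helper names trim_to_root and site_query are made up for this example.

```python
from urllib.parse import urlsplit


def trim_to_root(url):
    """Keep only the scheme and host, dropping path, query and fragment."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def site_query(url):
    """Build a site: query in the same shape as the example above."""
    return "site:" + trim_to_root(url)


query = site_query("http://www.rootdomain.com/blog/post-1?x=2")
print(query)
```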

Fast and Slow Poster Error Codes

Login,Failed - This means that the blog requires you to log in before you can post a comment.  Scrapebox does not support this, so the post failed.

 

What tokens can be used in the Learning Mode Addon Poster?

The Learning Mode Poster Addon can use the below tokens.  Note: these tokens will not work with Fast and Slow poster in the main scrapebox window.  For the tokens that work with fast and slow poster, see here.

  • %USERNAME% will be replaced with the user's name from the file loaded in for usernames.
  • %USEREMAIL% will be replaced with the user's email from the file loaded in for emails.
  • %USERURL% will be replaced with the user's website from the file loaded in the user url slot.
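If it helps to picture what the tokens do, the replacement works roughly like the Python sketch below.  This is only an illustration of the idea, not the addon's code.  The function name fill_tokens and the sample comment text are made up.

```python
def fill_tokens(template, name, email, url):
    """Swap the three Learning Mode tokens for concrete values."""
    replacements = {
        "%USERNAME%": name,
        "%USEREMAIL%": email,
        "%USERURL%": url,
    }
    for token, value in replacements.items():
        template = template.replace(token, value)
    return template


filled = fill_tokens(
    "Posted by %USERNAME% (%USEREMAIL%) from %USERURL%",
    "Joe",
    "joe@example.com",
    "http://www.site.com",
)
print(filled)
```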

How do I specify the Anchor I want to use for my links?

Whatever you place in the names section is what will be used as your anchor.

How do you lock a specific anchor/name to a specific url?

You can lock a specific anchor to a specific url when posting.  This allows you to post to many urls with different anchors in the same posting run.   This feature is called Link Lock.

You use link lock in the websites.txt file.  It should be formatted like this:

http://www.site.com {anchor1|anchor2|etc..}
http://www.site1.com {anchor1|anchor2|etc..}

You are still required to place something in the names.txt file, as this is the backup file that will be used if something goes wrong.  For instance, if your spin syntax was messed up, scrapebox may not be able to understand what anchor you wanted, so it would use a line from the names file.
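To show how a link lock line breaks down, and why the names file matters as a backup, here is a hedged Python sketch.  It is a guess at the general idea, not how scrapebox parses the file internally.  The names LINK_LOCK_RE and pick_anchor are hypothetical, and it assumes each line is non-empty and starts with the url.

```python
import random
import re

# url, whitespace, then {anchor1|anchor2|...} with properly closed braces.
LINK_LOCK_RE = re.compile(r"^(\S+)\s+\{([^{}]+)\}$")


def pick_anchor(line, fallback_names):
    """Return (url, anchor) for one websites.txt line.

    If the spin syntax cannot be read, fall back to a random line
    from the names list, as the text above describes.
    """
    match = LINK_LOCK_RE.match(line.strip())
    if match:
        url, spin = match.groups()
        options = [o for o in spin.split("|") if o]
        if options:
            return url, random.choice(options)
    url = line.split()[0]  # assumes the line is not empty
    return url, random.choice(fallback_names)


print(pick_anchor("http://www.site.com {anchor1|anchor2}", ["backup name"]))
print(pick_anchor("http://www.site1.com {anchor1|anchor2", ["backup name"]))
```

The second line is missing its closing brace, so the sketch falls back to the names list.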


Does scrapebox come with proxies to use?

Scrapebox has a proxy harvester, where it will go and scrape proxies from websites for you.  It comes included with some sources, but more importantly it gives you the option to add your own custom sources.

How do I add my own custom proxy sources?

In the Select Engines & Proxies section click:

Manage Proxies -> Harvest Proxies -> Add Source

You can add both HTTP and SOCKS proxy sources.

Proxy sources must be a regular web page that lists proxies and does not require a login.  For instance, if the list is contained in a flash popup or a downloadable file, it won't work.  Only a regular HTML style web page will work.

Additionally, the proxies have to be updated on the very url you specify.  If you specify the home page of a site and the newest proxies are listed on that home page, it will work.  If the proxies are listed on new sub pages of the site, scrapebox will not spider the site to find the pages with the proxies.  So you need to find a static url that is updated.
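A quick way to judge whether a page would make a usable source is to check that the proxies appear as plain ip:port text in the page's HTML.  The Python sketch below does that on a string of HTML you have already saved.  It is not the harvester's actual logic, and the names PROXY_RE and extract_proxies are made up for illustration.

```python
import re

# Plain ip:port text such as 10.0.0.1:8080 (no range checking on the numbers).
PROXY_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{2,5})\b")


def extract_proxies(html):
    """Return unique ip:port strings in the order they appear."""
    found = []
    for ip, port in PROXY_RE.findall(html):
        proxy = f"{ip}:{port}"
        if proxy not in found:
            found.append(proxy)
    return found


sample_html = "<td>10.0.0.1:8080</td><td>10.0.0.2:3128</td><td>10.0.0.1:8080</td>"
proxies = extract_proxies(sample_html)
print(proxies)
```

If a check like this finds nothing in the page source, the proxies are probably loaded some other way (flash, a download, a script) and the page is unlikely to work as a source.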

What are the minimum system requirements for running Scrapebox?

Scrapebox doesn't have officially stated minimum requirements.  However, generally speaking, it's not terribly taxing on a system.

It does require the Windows OS.  It has been tested on Windows XP, Windows Vista and Windows 7, as well as Windows 2003 and 2008 server.

I have seen it run on some pretty minimal systems and VPS.  However, when you want to run more than 1 instance, each additional instance consumes more resources.  Working with larger files will also use more resources.  So the more you want to use it, the more powerful a system you will need.

Scrapebox says X number were successful when posting, but the links aren’t there

If you have ever commented on a blog manually then you might remember that sometimes you will get a notification from the blog saying that the comment was received.  However many times you get no notification from the blog.  When scrapebox receives that notification from the blog that the comment was received it reports it as successful.  When it receives no notification it assumes that it failed.  This does not mean that it did in fact fail, but scrapebox can not determine if it was successful or not.   Many of the blogs that show failed in scrapebox, actually received your comment.

The blog commenter is just that, a commenter.  It places the comments.  Most comments will then go to moderation and be approved or denied.  Some comments will be placed on auto approve blogs, where the comment is automatically approved and goes live immediately.  So a successful post just means the comment was submitted.  It does not mean that the link is live on the site.  Unless the blog is auto approve, it is up to the admin to approve or deny it.  If you check links right after you comment, the ones you find are going to be on auto approve blogs.

How do I scrape all of the backlinks/inlinks to any given url?

You can scrape the backlinks to any given url (up to a maximum of 1000, per the engine's API limitation) by using the link: operator.

link:http://www.site.com/fullurl

At this time, using this feature in Yahoo is the same as using Yahoo Site Explorer.

If I reformat my hard drive or get a new computer what do I do?

Simply re-download and reinstall Scrapebox.  When you load the program, click the Activate button, just like you did the first time.  Then fill out the required info, check the transfer license option, and hit submit.

(You can re-download scrapebox here: http://www.scrapebox.com/payment-received)

Error Codes

You may encounter error codes as you use scrapebox.

Most error codes are simply the standard HTTP status codes that you would get if you were using a regular web browser.  Examples would be anything in the ranges:
1xx, 2xx, 3xx, 4xx, 5xx

Error code 0 - generally means you have bad proxies or you have misconfigured something in scrapebox.

For an explanation of error codes, choose the help menu in scrapebox:

Help -> Server Error Code Reference


Error 999

When trying to access a Yahoo service (e.g. Yahoo Mail, Yahoo Groups, Yahoo News, Yahoo Search or Siteexplorer), you receive the following error message:

Unable to process request at this time — error 999 Unfortunately, we are unable to process your request at this time. We apologize for the inconvenience. Please try again later.

Operating System: Any.

Background: This error appears to be a “catch-all” error code that Yahoo serves up when it doesn’t have a more specific error code. It essentially means “Oops! Something went wrong but we don’t know what, so we’ll just say that Error 999 occurred.”

The most common reason for receiving Yahoo Error 999 is due to some sort of bandwidth limiting system that Yahoo has put in place on their servers. Once you have exceeded your allotted bandwidth for a specific period of time Yahoo gives you this Error 999 message and doesn’t allow you to access the service. People have primarily reported receiving this error when they try to access Yahoo Mail or Yahoo Groups, but other Yahoo services may also be affected.

Why has Yahoo done this? There are two reasons that I can think of:

1. To prevent DoS (Denial of Service) attacks.
2. To stop automated tasks from hammering their servers with hundreds of requests a second.

There are many programs around that offer to automate access to various Yahoo services, e.g. check your Yahoo mailbox every 5 minutes, archive Yahoo Groups messages, download files from the Yahoo Groups Photos and Files sections, etc. If you use one of these automated tools then there is a very real possibility that you will run into the Error 999 message. Normal human usage of the Yahoo services shouldn’t generate enough traffic to trigger the Error 999 message unless you’re a very heavy user.

It appears that Yahoo uses your IP address to track the amount of traffic you’re generating on Yahoo, and once you reach the limit you get blocked by the Unable to process request at this time — error 999 message. Once triggered you will find that your IP address has been blocked for a period of time, somewhere between 2 and 24 hours usually.

How do I use tokens with the M (merge) option?

OK, let me explain the "M" merge a little.  Say you have the following keywords in the ScrapeBox keyword box:

black
white

If you hit the "M" to load a footprint file and it contains this:

"powered by wordpress" %KW%
intitle:%KW%
intitle:%KW% inurl:%KW%

The outcome will be:

"powered by wordpress" black
"powered by wordpress" white
intitle:black
intitle:white
intitle:black inurl:black
intitle:white inurl:white

I hope that makes sense.  If not, just experiment with it.
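If you want to preview the outcome before loading anything, the %KW% merge above can be mimicked with a few lines of Python.  This is only a sketch of the behavior shown in the example, not ScrapeBox's code, and the function name merge_tokens is made up.

```python
def merge_tokens(keywords, footprints, token="%KW%"):
    """Replace every %KW% in each footprint line with each keyword."""
    return [fp.replace(token, kw) for fp in footprints for kw in keywords]


keywords = ["black", "white"]
footprints = [
    '"powered by wordpress" %KW%',
    "intitle:%KW%",
    "intitle:%KW% inurl:%KW%",
]
outcome = merge_tokens(keywords, footprints)
for line in outcome:
    print(line)
```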

How does the link checker work?

The link checker checks strictly for text.  So if you put in:

Hi how are you?

It will check the html code on all the pages you load up for that phrase.

When you put in:

http://www.site.com

It checks for that text, which would appear if there was a link on the page.
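In other words, the check is a plain text search of the page source.  A minimal Python sketch of that idea is below.  The function name contains_text and the sample HTML are made up, and whether the real checker is case sensitive is not something this sketch settles (the plain "in" test used here is case sensitive).

```python
def contains_text(html, needle):
    """True if the exact text appears anywhere in the page's HTML."""
    return needle in html


page = '<p>Nice post!</p><a href="http://www.site.com">my site</a>'
print(contains_text(page, "http://www.site.com"))
print(contains_text(page, "Hi how are you?"))
```

Because it is only a text match, a url that appears on the page as plain text and not as a link would be found too.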

Where are harvested results stored?

Urls are stored in chunks of 1 million in the following folder:

Scrapebox Folder >> Harvester Sessions >> Harvester_ XXX_XXX

Each session creates a new folder.  You may want to delete these from time to time.

Is there a monthly cost for Scrapebox?

No, not at this time.
