webStraktor download for Linux

Download the webStraktor Linux app for free and run it online in Ubuntu, Fedora, or Debian.

webStraktor is a Linux app whose latest release can be downloaded as webStraktor-20140420-R01.zip. It can be run online on OnWorks, a free hosting provider for workstations.

Download this app and run it online with OnWorks for free.

Follow these instructions to run this app:

- 1. Download this application to your PC.

- 2. Open our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.

- 3. Upload this application to that file manager.

- 4. Start the OnWorks Linux online, Windows online, or macOS online emulator from this website.

- 5. From the OnWorks Linux OS you have just started, go to our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.

- 6. Download the application, install it, and run it.


DESCRIPTION

webStraktor is a programmable World Wide Web data extraction client. Its purpose is to scrape HTML-based content over HTTP and extract the relevant information. webStraktor features a scripting language that facilitates the collection, extraction, and storage of information available on the web, including images. The scripting language uses elements of regular expression and XPath syntax. It has a small instruction set, and its syntax is easy to master.
The standard webStraktor output format is XML-based, encoded as ASCII, UTF-8, or ISO-8859-1 (Latin-1).
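To make the extract-then-store idea above concrete, here is a minimal sketch in Python. It is not webStraktor's scripting language and does not use any webStraktor API; the sample HTML, the `LINK_PATTERN` expression, and the `extract_links` and `to_xml` helpers are all hypothetical names chosen for illustration. It pulls link targets out of a page with a regular expression and writes them as XML in a chosen encoding.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample page; a real run would fetch this over HTTP.
SAMPLE_HTML = """
<html><body>
  <a href="/news/1">First headline</a>
  <a href="/news/2">Café opening</a>
  <img src="/img/logo.png" alt="logo">
</body></html>
"""

# Illustrative pattern: captures the href value and the link text of <a> tags.
LINK_PATTERN = re.compile(
    r'<a\s+href="([^"]+)"[^>]*>(.*?)</a>', re.IGNORECASE | re.DOTALL
)


def extract_links(html):
    """Return (href, text) pairs for every anchor matched by LINK_PATTERN."""
    return [(href, text.strip()) for href, text in LINK_PATTERN.findall(html)]


def to_xml(links, encoding="UTF-8"):
    """Serialize the pairs as an XML document, returned as bytes in `encoding`."""
    root = ET.Element("links")
    for href, text in links:
        item = ET.SubElement(root, "link", {"href": href})
        item.text = text
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(root, encoding=encoding, xml_declaration=True)


if __name__ == "__main__":
    found = extract_links(SAMPLE_HTML)
    print(to_xml(found, "ISO-8859-1").decode("iso-8859-1"))
```

With an ASCII target encoding, characters outside ASCII are written as numeric character references, so the same data can be stored in any of the three encodings the description lists.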
webStraktor relies on the Apache HttpClient for retrieving content over HTTP. It adheres to the Robots Exclusion Protocol and can be configured to operate anonymously by connecting through the most common types of web proxy server.
webStraktor extends the functionality of web crawlers, spiders or bots by integrating scraping and crawling capabilities.
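
As an illustration of what adhering to the Robots Exclusion Protocol involves, the sketch below uses Python's standard `urllib.robotparser` module. It does not show how webStraktor implements the check; the robots.txt content, the `ExampleBot/1.0` User Agent string, and the `is_allowed` helper are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a crawler would normally download it
# from the target site before requesting any other page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    agent = "ExampleBot/1.0"  # hypothetical configurable User Agent signature
    for page in ("https://example.com/public/page.html",
                 "https://example.com/private/data.html"):
        print(page, is_allowed(ROBOTS_TXT, agent, page))
```

A crawler that honors the protocol would run a check like this before each request and skip any URL for which it returns False.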



Features

  • programmable web crawler (web spider or web bot)
  • easy-to-master scripting language
  • Java Swing-based graphical development environment
  • UTF-8 or ISO-8859-1 XML output
  • integrates with readily available scheduling applications
  • exhaustive configuration
  • web proxy server support
  • robot exclusion protocol support
  • configurable User Agent signature
  • step-by-step tutorial and example scripts
  • Apache HttpClient based


Audience

Developers, Architects


User interface

Java Swing


Programming Language

Java


Database Environment

XML-based


This application can also be fetched from https://sourceforge.net/projects/webstraktor/. It is hosted on OnWorks so that it can be run online more easily from one of our free operating systems.

