API Basics

To call the API, send a request to a URL of the form http://convextra.com/extapi/%methodName%/, where %methodName% is the name of the method being invoked. Additional parameters can be passed by GET or POST, at the client's choice. The response is a JSON object. Every call must include the apiKey parameter, which contains your Private Key (available on your profile settings page).
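The URL scheme described above can be sketched in Python using only the standard library. The helper names build_request_url and call_api are illustrative and not part of the API, and the sketch assumes parameters are sent by GET as an ordinary URL-encoded query string.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://convextra.com/extapi"  # base URL taken from the text above


def build_request_url(method_name, api_key, **params):
    """Return the GET URL for an API method, with apiKey added to the query."""
    query = {"apiKey": api_key, **params}
    return f"{BASE_URL}/{method_name}/?{urlencode(query)}"


def call_api(method_name, api_key, **params):
    """Send the request and decode the JSON object in the response (hypothetical helper)."""
    with urlopen(build_request_url(method_name, api_key, **params)) as resp:
        return json.load(resp)


# Building the URL needs no network access:
print(build_request_url("apiGetData", "YOUR_PRIVATE_KEY", url="http://example.com/"))
```

For long values such as a full page of HTML, POST is likely the more practical choice, since query strings have practical length limits.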


API Methods

Each method is described below, together with its input parameters. Required parameters are highlighted in bold.


apiGetData

This method extracts data from a page.

URL: http://convextra.com/extapi/apiGetData/

Parameters:

  • (string) url - URL of the page to extract data from.
  • (string) html - HTML code of the page. If html is specified, url is not used to load the page code.
  • (string) standartStamp - unique ID of the data-set to parse. If standartStamp is not specified, the system picks the most probable data-set. Use the apiAnalyzePage method to get the available data-sets.
  • (range 0..100) optAccuracy (80 by default) - minimum similarity that elements must have to be treated as belonging to the same data-set.
  • (string) document_cookie - cookie string sent when loading the page specified in url. It has no effect when the html parameter is used.
  • (bool) csv - if set, the parsing results are returned as a link to a CSV file.
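As a rough illustration of how these parameters interact, the sketch below assembles a parameter dictionary for apiGetData. The function name build_get_data_params is hypothetical. Two points are assumptions rather than documented behavior: that at least one of url or html must be supplied, and that the csv flag is sent as 1.

```python
def build_get_data_params(api_key, url=None, html=None, standart_stamp=None,
                          opt_accuracy=80, document_cookie=None, csv=False):
    """Assemble the parameter dict for apiGetData from the list above."""
    # Assumption: the service needs a page from one source or the other.
    if url is None and html is None:
        raise ValueError("either url or html must be given")
    if not 0 <= opt_accuracy <= 100:
        raise ValueError("optAccuracy must be in the range 0..100")

    params = {"apiKey": api_key, "optAccuracy": opt_accuracy}
    if html is not None:
        # html takes precedence: url and document_cookie would not be used.
        params["html"] = html
    else:
        params["url"] = url
        if document_cookie is not None:
            params["document_cookie"] = document_cookie
    if standart_stamp is not None:
        params["standartStamp"] = standart_stamp  # spelled as in the parameter list
    if csv:
        params["csv"] = 1  # assumption: booleans are encoded as 0/1
    return params


print(build_get_data_params("YOUR_PRIVATE_KEY", url="http://example.com/", csv=True))
```

The resulting dictionary can be URL-encoded for a GET request or sent as a POST body.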


Return values:
  • parsedData - array of rows of extracted data. Each element of a row (a cell of the resulting table) has the following properties:
    • data - textual representation of the cell
    • linkedTo - the link address, if the cell is a link
    • isImage - indicates whether the cell is an image
    • label - name of the column the cell belongs to
    • propertyIdentifier - unique ID of the column the cell belongs to
  • csvUrl - link to the CSV file with the parsing results.
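A minimal sketch of consuming these return values follows. It assumes parsedData is a list of rows and that each row is a list of cell objects carrying the properties listed above. The sample response is invented for illustration, and the value types shown for linkedTo and isImage are guesses.

```python
def rows_to_records(response):
    """Convert parsedData into a list of {label: data} dicts, one per row."""
    records = []
    for row in response.get("parsedData", []):
        records.append({cell["label"]: cell["data"] for cell in row})
    return records


# Hypothetical response, shaped after the property list above.
sample_response = {
    "parsedData": [
        [
            {"data": "Widget", "linkedTo": "http://example.com/widget",
             "isImage": False, "label": "Name", "propertyIdentifier": "col1"},
            {"data": "9.99", "linkedTo": "",
             "isImage": False, "label": "Price", "propertyIdentifier": "col2"},
        ],
    ],
}

print(rows_to_records(sample_response))  # [{'Name': 'Widget', 'Price': '9.99'}]
```

Keying on propertyIdentifier instead of label may be safer if two columns can share the same name.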


apiAnalyzePage

This method analyzes a page and detects the data-sets available on it.

URL: http://convextra.com/extapi/apiAnalyzePage/

Parameters:

  • (string) url - URL of the page to analyze.
  • (string) html - HTML code of the page. If html is specified, url is not used to load the page code.
  • (range 0..100) optAccuracy (80 by default) - minimum similarity that elements must have to be treated as belonging to the same data-set.
  • (string) document_cookie - cookie string sent when loading the page specified in url. It has no effect when the html parameter is used.
  • (bool) optDetectPagination (0, 1) - if set to 1, the system tries to determine the pagination scheme automatically.


Return values:
  • set - array of the data-sets that were detected. Each data-set has the following properties:
    • standardStamp - unique data-set ID
    • items - array of the data-set’s elements (with XPath)
    • isImage - indicates whether a cell is an image
    • label - name of the column a cell belongs to
    • propertyIdentifier - unique ID of the column a cell belongs to
  • pagination - information about the detected pagination, with the following properties:
    • scheme - link template for the pages
    • step - pagination step
    • range - range of pages
    • currentPage - current page
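One possible way to use this response is to list the detected data-sets and pick one whose ID is then passed to apiGetData. The sketch below assumes that set is a list of objects and that items is a list. The helper names and the sample response are hypothetical, and choosing the data-set with the most items is only one heuristic.

```python
def summarize_data_sets(response):
    """Return (standardStamp, number of items) for each detected data-set."""
    return [(ds["standardStamp"], len(ds.get("items", [])))
            for ds in response.get("set", [])]


def largest_data_set_stamp(response):
    """Return the ID of the data-set with the most items, or None if none were found."""
    summary = summarize_data_sets(response)
    if not summary:
        return None
    return max(summary, key=lambda pair: pair[1])[0]


# Hypothetical response, shaped after the property list above.
sample_analysis = {
    "set": [
        {"standardStamp": "stamp-a", "items": ["item1", "item2"]},
        {"standardStamp": "stamp-b", "items": ["item1", "item2", "item3"]},
    ],
}

print(summarize_data_sets(sample_analysis))     # [('stamp-a', 2), ('stamp-b', 3)]
print(largest_data_set_stamp(sample_analysis))  # stamp-b
```

The chosen ID would then be sent to apiGetData as the standartStamp parameter. The response field and the request parameter are spelled differently in this documentation.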

