wpipe

A universal asset storage for the toolserver, so tools can run in a pipeline!

Every asset gets an asset ID, through which it can be queried. Assets might expire after some time; using an asset resets that time. Anyone who has the asset ID can access the asset. Assets can store any text-based data, but that data should be JSON (see below). All functions listed below return JSON, or JSONP if you add a callback URL parameter.

Actions

Action parameters you can use on this page.

action=reserve
Returns a new asset ID. The asset status will be 'reserved'. Can optionally take app (application name) and note parameters, which will be stored with the asset.
action=start
Requires asset parameter. Changes the asset status to 'running'.
action=done
Requires asset parameter. Stores a result. You can either
- pass a file name without path in the file parameter, if you have already created it in the datadir directory. No other directories will work!
- pass the actual data in the raw parameter. You should really use POST for this one!
Changes the asset status to 'done'. The asset cannot be changed from this point on.
action=store
Executes reserve, start, and done in one go. Asset ID is returned. Parameters for reserve and done can be used. Changes the asset status to 'done'.
action=fail
Requires asset parameter. Changes the asset status to 'failed'. The asset cannot be changed from this point on.
action=status
Requires asset parameter. Returns the status of a given asset ID.
action=info
Requires asset parameter. Returns metadata for a given asset ID.
action=datadir
Returns the data directory path on the toolserver. A toolserver app can store a result in that directory, then call done or store with the file parameter.
action=result
Requires asset parameter. Returns the data associated with the asset, if 'done'. The actual result will be served as a string.
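
As a rough sketch of how the actions above map to requests, the following helpers build a request URL and the options for a POST request. The function names wpipeUrl and wpipeDonePostOptions, the application name, the asset ID and the callback name are all made up for illustration; the base URL is the one used by the PHP API class further down this page. That the service reads action, asset and raw from the POST body is an assumption.

```php
<?php
// Hypothetical helpers, not part of wpipe itself.

// Builds the URL for an action; a non-empty $callback requests JSONP instead of JSON.
function wpipeUrl ( $action , $params = array() , $callback = '' ) {
	$query = array_merge ( array ( 'action' => $action ) , $params ) ;
	if ( $callback != '' ) $query['callback'] = $callback ;
	return 'https://toolserver.org/~magnus/wpipe?' . http_build_query ( $query , '' , '&' ) ;
}

// Builds stream context options for sending 'done' with the raw parameter via POST.
// Assumption: the service accepts action, asset and raw in the POST body.
function wpipeDonePostOptions ( $asset , $raw ) {
	$body = http_build_query ( array ( 'action' => 'done' , 'asset' => $asset , 'raw' => $raw ) , '' , '&' ) ;
	return array ( 'http' => array (
		'method' => 'POST' ,
		'header' => 'Content-Type: application/x-www-form-urlencoded' ,
		'content' => $body
	) ) ;
}

// Reserve an asset for an application called "mytool" (made-up name)
print wpipeUrl ( 'reserve' , array ( 'app' => 'mytool' , 'note' => 'test run' ) ) . "\n" ;
// Ask for the status of asset 123 (made-up ID) as JSONP
print wpipeUrl ( 'status' , array ( 'asset' => 123 ) , 'handleStatus' ) . "\n" ;
```

The options array could then be passed to stream_context_create(), and the resulting context used as the third argument of file_get_contents().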

Resources

Suggested JSON format

This is a (strong) suggestion for the data format that gets stored in assets, if that data refers to Wiki(m|p)edia pages. This is intended to make a wide range of tools compatible with each other.

{
	"type":"pagelist",
	"app":"The name of the application that generated this data",
	"projects": [
		{
			"project":"wikipedia",
			"language":"en",
			"namespaces":{
				"1":"Talk"
			} ,
			"pages" : [ {
				"title":"Main_Page",
				"ns":1,
				"fulltitle":"Talk:Main_Page"
			} ]
		}
	]
}
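
A structure like the one above could be assembled in PHP roughly as follows. This is only a sketch: the function name makePagelist, its arguments and the application name are illustrative and not part of wpipe.

```php
<?php
// Builds a "pagelist" structure in the suggested format for a single project.
// $namespaces maps namespace numbers to names; $pages is a list of array ( title , ns ).
function makePagelist ( $app , $project , $language , $namespaces , $pages ) {
	$list = array() ;
	foreach ( $pages AS $p ) {
		list ( $title , $ns ) = $p ;
		// Prefix the title with the namespace name, if there is one
		$prefix = ( isset ( $namespaces[$ns] ) && $namespaces[$ns] != '' ) ? $namespaces[$ns] . ':' : '' ;
		$list[] = array ( 'title' => $title , 'ns' => $ns , 'fulltitle' => $prefix . $title ) ;
	}
	return array (
		'type' => 'pagelist' ,
		'app' => $app ,
		'projects' => array ( array (
			'project' => $project ,
			'language' => $language ,
			'namespaces' => (object) $namespaces , // force a JSON object, even if the only key is 0
			'pages' => $list
		) )
	) ;
}

$data = makePagelist ( 'mytool' , 'wikipedia' , 'en' , array ( 1 => 'Talk' ) , array ( array ( 'Main_Page' , 1 ) ) ) ;
print json_encode ( $data ) . "\n" ;
```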
	

Notes

PHP API class

Here's a convenient PHP abstraction of the API:
<?PHP

class Wpipe {

	public $wpipe_url = 'https://toolserver.org/~magnus/wpipe' ;

	// Reserves a new asset; returns new asset ID, or NULL in case of error
	public function reserve ( $app = '' , $note = '' ) {
		$v = json_decode ( file_get_contents ( $this->wpipe_url . '?action=reserve&app='.urlencode($app).'&note='.urlencode($note) ) ) ;
		if ( isset ( $v->error ) && $v->error == 'OK' ) return $v->asset ; // isset also covers a failed request or invalid JSON
		return NULL ;
	}
	
	// Returns the result data for an asset, optionally parsed as JSON, or NULL in case of error
	public function getResult ( $asset , $parse_json = true ) {
		$d = file_get_contents ( $this->wpipe_url . '?action=result&asset=' . urlencode($asset) ) ;
		$d = json_decode ( $d ) ;
		if ( !isset ( $d->error ) || $d->error != 'OK' ) return NULL ; // also covers a failed request or invalid JSON
		if ( $parse_json ) $d = json_decode ( $d->result ) ;
		else $d = $d->result ;
		return $d ;
	}
	
	// Returns the path to store your data files in
	public function getDataDir () {
		if ( !isset ( $this->datadir ) ) {
			$v = json_decode ( file_get_contents ( $this->wpipe_url . '?action=datadir' ) ) ;
			$this->datadir = $v->datadir ;
		}
		return $this->datadir ;
	}
	
	// When done, this function stores the passed result object as a JSON string in a temporary file
	public function done ( $asset , $result ) {
		$datadir = $this->getDataDir() ;
		$tmpfname = tempnam($datadir, "TMP");
		$handle = fopen($tmpfname, "w");
		fwrite($handle, json_encode ( $result ));
		fclose($handle);
		$file = basename ( $tmpfname ) ; // file name without path, as required by the file parameter
		$this->doneFile ( $asset , $file ) ;
	}
	
	// When done, this function sets a file as the result data
	public function doneFile ( $asset , $file ) {
		file_get_contents ( $this->wpipe_url . '?action=done&asset='.urlencode($asset).'&file='.urlencode($file) ) ;
	}

	// Fails the asset; in case it cannot be created due to error etc.
	public function fail ( $asset ) {
		file_get_contents ( $this->wpipe_url . '?action=fail&asset='.urlencode($asset) ) ;
	}

	// Tags the asset as started
	public function start ( $asset ) {
		file_get_contents ( $this->wpipe_url . '?action=start&asset='.urlencode($asset) ) ;
	}
	
	// END OF CORE FUNCTIONS. EVERYTHING BELOW THIS LINE IS JUST FOR ADDITIONAL CONVENIENCE. No need to copy if you already have similar functions yourself.
	//____________________________________________________________________________________________________________________________________________
	
	
	//____________________________________________________________________________________________________________________________________________
	// CONVENIENCE FUNCTIONS

	private $command_line_parameters ;
	
	// Determines if the tool is run on the web, or on command line
	public function isWebUse () {
		return isset($_SERVER['HTTP_USER_AGENT']) ;
	}

	// Get database connection, with or without (default) user DB
	// Note: this relies on the legacy mysql_* functions, which were removed in PHP 7.0
	function getDBcon ( $language , $project , $mysql_user = '' , $mysql_password = '' , $dbmode = "rrdb" ) { // 'rrdb' or 'userdb'
		if ( $mysql_user == '' ) $mysql_user = get_current_user() ;
		if ( $mysql_password == '' ) $mysql_password = $this->getDatabasePassword() ;
		$dbname = $this->getDBname ( $language , $project ) ;
		$server = str_replace ( '_' , '-' , $dbname ) . ".$dbmode.toolserver.org" ;
		if ( !$mysql_con = @mysql_connect ( $server , $mysql_user , $mysql_password ) ) {
			return null ;
		}
		if ( !mysql_select_db ( $dbname , $mysql_con ) ) return null ;
		return $mysql_con ;
	}
	
	// Get your MySQL database password, if it's stored in ~/.my.cnf
	public function getDatabasePassword () {
		$mysql_password = '' ;
		$pwd_file = "/home/" . get_current_user() . "/.my.cnf" ;
		if ( file_exists ( $pwd_file ) ) {
			$t = file_get_contents ( $pwd_file ) ;
			$lines = explode ( "\n" , $t ) ;
			foreach ( $lines AS $l ) {
				$l = trim ( $l ) ;
				if ( substr ( $l , 0 , 8 ) != 'password' ) continue ;
				$l = explode ( '"' , $l ) ;
				array_shift ( $l ) ;
				$mysql_password = array_shift ( $l ) ;
			}
		}
		return $mysql_password ;
	}

	public function getDBname ( $language , $project = 'wikipedia' ) {
		$ret = $language ;
		if ( $language == 'commons' ) $ret = 'commonswiki_p' ;
		else if ( $project == 'wikipedia' ) $ret .= 'wiki_p' ;
		else $ret .= $project . '_p' ;
		return $ret ;
	}
	

	
	// Returns a parameter, either from the URL (preferred) or from the command line; falls back on the passed default value otherwise
	public function getParam ( $key , $default ) {
		// Try web params
		if ( isset($_REQUEST[$key]) ) return $_REQUEST[$key] ;
		
		// Try command line params
		if ( !isset($this->command_line_parameters) ) {
			$this->command_line_parameters = $this->parseParameters() ;
		}
		if ( isset($this->command_line_parameters[$key]) ) return $this->command_line_parameters[$key] ;
		
		// Fall back on default
		return $default ;
	}
	
	// Returns the full name of a page, either by using 'fulltitle', or by reconstruction from the project object namespaces
	public function getFullName ( $page , $project ) { // $project is the object containing the namespaces!
		if ( isset($page->fulltitle) ) return $page->fulltitle ;
		$ns = $page->ns ;
		// json_decode returns objects by default, so namespaces are accessed as properties
		if ( isset($project->namespaces->$ns) && $project->namespaces->$ns != '' ) {
			return $project->namespaces->$ns . ':' . $page->title ;
		}
		return $page->title ;
	}

	//____________________________________________________________________________________________________________________________________________
	// PRIVATE FUNCTIONS
	
	// From http://www.php.net/manual/en/function.getopt.php#83414
	private function parseParameters($noopt = array()) {
        $result = array();
        if ( !isset($GLOBALS['argv']) ) return $result;
        $params = $GLOBALS['argv'];
        // could use getopt() here (since PHP 5.3.0), but it doesn't work reliably
        // index-based loop; each() and the $p{0} syntax no longer exist in PHP 8
        $params = array_values($params);
        $count = count($params);
        for ($i = 0; $i < $count; $i++) {
            $p = (string) $params[$i];
            if (substr($p, 0, 1) == '-') {
                $pname = substr($p, 1);
                $value = true;
                if (substr($pname, 0, 1) == '-') {
                    // long-opt (--<param>)
                    $pname = substr($pname, 1);
                    if (strpos($p, '=') !== false) {
                        // value specified inline (--<param>=<value>)
                        list($pname, $value) = explode('=', substr($p, 2), 2);
                    }
                }
                // check if next parameter is a descriptor or a value
                $nextparm = ($i + 1 < $count) ? (string) $params[$i + 1] : false;
                if (!in_array($pname, $noopt) && $value === true && $nextparm !== false && substr($nextparm, 0, 1) != '-') {
                    $value = $nextparm;
                    $i++; // the next parameter was consumed as this option's value
                }
                $result[$pname] = $value;
            } else {
                // param doesn't belong to any option
                $result[] = $p;
            }
        }
        return $result;
    }

    
}

?>

To make your tools asset-compatible, read the input data from the asset given in the 'source_asset' parameter, and store the output data in the asset given in the 'target_asset' parameter, if these are passed. It helps if tools can be run from the command line as well as on the web; just use 'getopt' as a fallback if the URL parameters are empty.
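
The pattern described above might look roughly like the sketch below. The function runTool, the namespace filter and the FakeWpipe stand-in are made up for illustration; a real tool would pass an instance of the Wpipe class above and take the two asset IDs from getParam ( 'source_asset' , '' ) and getParam ( 'target_asset' , '' ). Whether the target asset has been reserved beforehand by the caller is an assumption here.

```php
<?php
// Sketch of one pipeline step. $wpipe is expected to offer the getResult, start,
// done and fail methods of the Wpipe class.
function runTool ( $wpipe , $source_asset , $target_asset ) {
	$input = $wpipe->getResult ( $source_asset ) ;
	if ( $input === null ) { // no usable input: mark the output asset as failed
		$wpipe->fail ( $target_asset ) ;
		return false ;
	}
	$wpipe->start ( $target_asset ) ;
	// Example "work": keep only pages in namespace 0
	foreach ( $input->projects AS $project ) {
		$keep = array() ;
		foreach ( $project->pages AS $page ) {
			if ( $page->ns == 0 ) $keep[] = $page ;
		}
		$project->pages = $keep ;
	}
	$input->app = 'example_filter' ;
	$wpipe->done ( $target_asset , $input ) ;
	return true ;
}

// In-memory stand-in, so the sketch can be run without the toolserver
class FakeWpipe {
	public $assets = array() ;
	public $status = array() ;
	public function getResult ( $asset ) {
		return isset ( $this->assets[$asset] ) ? json_decode ( $this->assets[$asset] ) : null ;
	}
	public function start ( $asset ) { $this->status[$asset] = 'running' ; }
	public function done ( $asset , $result ) {
		$this->assets[$asset] = json_encode ( $result ) ;
		$this->status[$asset] = 'done' ;
	}
	public function fail ( $asset ) { $this->status[$asset] = 'failed' ; }
}

$w = new FakeWpipe() ;
$w->assets['a1'] = '{"type":"pagelist","app":"x","projects":[{"project":"wikipedia","language":"en","namespaces":{"1":"Talk"},"pages":[{"title":"Main_Page","ns":0},{"title":"Main_Page","ns":1}]}]}' ;
runTool ( $w , 'a1' , 'a2' ) ;
print $w->assets['a2'] . "\n" ;
```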