Semantic Software Lab
Concordia University
Montréal, Canada

Blogroll

New York Machine Learning Deadlines

Machine Learning Blog - Tue, 2016-01-26 08:27

There are a number of different Machine Learning related paper deadlines coming up that may be of interest.

  • January 29 (abstract) for the March 4 New York ML Symposium. Register early because NYAS can only fit 300.
  • January 27 (abstract) / February 2 (paper) for IJCAI, July 9-15. The biggest AI conference.
  • February 5 (paper) for ICML, June 19-24. Nina and Kilian have 850 well-vetted reviewers. Marek and Peder have increased space to allow 3K people.
  • February 12 (paper) for COLT, June 23-26. Vitaly and Sasha are program chairs.
  • February 12 (proposal) for ICML workshops, June 23-24. Fei and Ruslan are the workshop chairs. I really like workshops.
  • February 19 (proposal) for ICML tutorials, June 19. Bernhard and Alina have invited a few tutorials already but are saving space for good proposals as well.
  • March 1 (paper) for UAI, June 25-29. Jersey City isn’t quite New York, but it’s close enough.
  • May ~2 for ICML workshops, June 23-24. Varies with the workshop.
Categories: Blogroll

Interesting things at NIPS 2015

Machine Learning Blog - Mon, 2015-12-14 14:43

NIPS is getting big. If you think of each day as a conference crammed into a day, you get a good flavor of things. Here are some of the interesting things I saw.

Two other notable events happened during NIPS.

  1. The Imagenet challenge and MS COCO results came out. The first represents a significant improvement over previous years (details here).
  2. The OpenAI initiative started. Concerned billionaires created a billion-dollar endowment to advance AI in a public (not private) way. What will be done better than the NSF (which has a similar(ish) goal)? I can think of many possibilities.

See also Seb’s post.

Categories: Blogroll

Ready to connect to the Semantic Web – now what?

Semantic Web Company - Wed, 2015-12-02 04:35

As an open data fan, or as someone who is just looking to learn how to publish data on the Web and distribute it through the Semantic Web, you will be facing the question “How do I describe the dataset that I want to publish?” The same question is asked by people who apply for a publicly funded project at the European Commission and need a Data Management Plan. Below we discuss possibilities that help describe the dataset to be published.

The goal of publishing the data should be to make it available for access or download and to make it interoperable. One of the big benefits is making the data available to software applications, which in turn means the datasets have to be machine-readable. From the perspective of a software developer, additional information beyond just name, author, owner, date, … would be helpful:

  • the condition for re-use (rights, licenses)
  • the specific coverage of the dataset (type of data, thematic coverage, geographic coverage)
  • technical specifications to retrieve and parse an instance (a distribution) of the dataset (format, protocol)
  • the features/dimensions covered by the dataset (temperature, time, salinity, gene, coordinates)
  • the semantics of the  features/dimensions (unit of measure, time granularity, syntax, reference taxonomies)

To describe a dataset, it is always best to look first at existing standards and vocabularies. The answer is not found in a single vocabulary but in several.

Data Catalog Vocabulary (DCAT)

DCAT is an RDF Schema vocabulary for representing data catalogs. It can describe any dataset, whether standalone or part of a catalog.

Vocabulary of Interlinked Datasets (VoID)

VoID is an RDF vocabulary, together with a set of instructions, that enables the discovery and usage of linked datasets. It is used for expressing metadata about RDF datasets.

Data Cube vocabulary

The Data Cube vocabulary focuses purely on the publication of multi-dimensional data on the Web. It is an RDF vocabulary for describing statistical datasets.

Asset Description Metadata Schema (ADMS)

ADMS, a W3C specification published in 2013, is a profile of DCAT used to describe semantic assets.

Existing vocabularies give only partial answers on how to describe your dataset; some aspects are missing or are complicated to express.

  1. Type of data – there is no specific property for the type of data covered in a dataset. This value should be machine-readable, which means it should be standardized, preferably as a URI that can be dereferenced to a thing, and this ‘thing’ should be part of an authority list/taxonomy, which does not exist yet. However, one can use adms:representationTechnique, which gives more information about the format in which a dataset is released, though this points only to dcterms:format and dcat:mediaType.
  2. Technical properties like format, protocol, etc.
    • There is no property for protocol, and again these values should be machine-readable and standardized, preferably as a URI.
    • VoID can help with the protocol metadata, but only for RDF datasets: void:dataDump, void:sparqlEndpoint.
  3. Dimensions of a dataset.
    • SDMX defines a dimension as “A statistical concept used, in combination with other statistical concepts, to identify a statistical series or single observations.” Dimensions in a dataset can therefore be called features, predictors, or variables (depending on the domain). One can use dc:conformsTo with a dc:Standard if the dataset dimensions can be defined by a formalized standard. Otherwise, statistical vocabularies can help with this aspect, which can become quite complex. One can use the Data Cube vocabulary, specifically qb:DimensionProperty, qb:AttributeProperty, qb:MeasureProperty, and qb:CodedProperty, in combination with skos:Concept and sdmx:ConceptRole.
  4. Data provenance – there is dc:source, which can be used at the dataset level, but there is no solution if we want to specify the source at the data record level.

In the end, one needs to combine different vocabularies to best describe a dataset; a small combined example is sketched below.
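To make this concrete, the following is a minimal sketch of such a combined description, loaded via a SPARQL update; the dataset URI, titles, and values are invented for illustration, and the properties are the DCAT, Dublin Core Terms, and VoID terms discussed above:

PREFIX dcat:    <http://www.w3.org/ns/dcat#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX void:    <http://rdfs.org/ns/void#>

INSERT DATA {
  # A hypothetical dataset described by combining DCAT, Dublin Core Terms and VoID
  <http://example.org/dataset/ocean-temperatures> a dcat:Dataset, void:Dataset ;
      dcterms:title    "Ocean surface temperatures 2010-2015" ;
      dcterms:license  <http://creativecommons.org/licenses/by/4.0/> ;  # condition for re-use
      dcterms:spatial  <http://dbpedia.org/resource/Atlantic_Ocean> ;   # geographic coverage
      dcat:theme       <http://dbpedia.org/resource/Oceanography> ;     # thematic coverage
      dcat:distribution [
          a dcat:Distribution ;
          dcat:mediaType   "text/csv" ;                                 # format of the distribution
          dcat:downloadURL <http://example.org/data/ocean-temperatures.csv>
      ] ;
      void:sparqlEndpoint <http://example.org/sparql> .                 # protocol-level access (RDF datasets only)
}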

The tools out there that help with publishing data seem to be missing one or more of the above-mentioned parts.

  • CKAN, maintained by the Open Knowledge Foundation, uses most of DCAT and doesn’t describe dimensions.
  • Dataverse, created by Harvard University, uses a custom vocabulary and doesn’t describe dimensions.
  • CIARD RING uses the full DCAT-AP with some extended properties (protocol, data type) and local taxonomies with URIs, mapped when possible to authorities.
  • OpenAIRE, DataCite (using re3data to search repositories) and Dryad use their own vocabularies.

In general, the solution to these issues seems to be to introduce custom vocabularies.

Categories: Blogroll

CNTK and Vowpal Wabbit tutorials at NIPS

Machine Learning Blog - Sun, 2015-11-29 10:33

Both CNTK and Vowpal Wabbit have pirate tutorials at NIPS. The CNTK tutorial is 1 hour during the lunch break of the Optimization workshop while the VW tutorial is 1 hour during the lunch break of the Extreme Multiclass workshop. Consider dropping by either if interested.

CNTK is a deep learning system started by the speech people who started the deep learning craze; it has grown into a more general, platform-independent deep learning system. It has various useful features, the most interesting of which is perhaps efficient scalable training. Using GPUs with allreduce and one-bit SGD, it achieves both high efficiency and scalability over many more GPUs than could ever fit into a single machine. This capability is unique amongst all open deep learning codebases, so everything else looks nerfed in comparison. CNTK was released in April, so this is the first chance for many people to learn about it. See here for more details.

The Vowpal Wabbit tutorial just focuses on what is new this year.

  1. The learning to search framework has greatly matured and is now easily used to solve ad-hoc joint (structured) prediction problems. The ICML tutorial covers algorithms/analysis, so this is about using the system.
  2. VW has also become the modeling element of a larger system (called the decision service) which gathers data and uses it for Contextual Bandit learning. This is now generally usable, and it is the first general-purpose system of this sort.
Categories: Blogroll

If you like “Friends” you will probably also like “Veronica’s Closet” (find out why with SPARQL)

Semantic Web Company - Fri, 2015-11-06 07:40

In a previous blog post I discussed the power of SPARQL to go beyond data retrieval to analytics. Here I look into the possibility of implementing a product recommender entirely in SPARQL. Products are considered similar if they share relevant characteristics, and the higher the overlap, the higher the similarity. In the case of movies or TV programs there are static characteristics (e.g. genre, actors, director) and dynamic ones like the viewing patterns of the audience.

The static part we can look up in resources like DBpedia. If we look at the data related to the resource <http://dbpedia.org/resource/Friends> (which represents the TV show “Friends”), we can use, for example, the associated subjects (see the predicate dcterms:subject). In this case we find, for example, <http://dbpedia.org/resource/Category:American_television_sitcoms> or <http://dbpedia.org/resource/Category:Television_shows_set_in_New_York_City>. If we want to find other TV shows that are related to the same subjects, we can do this with the following query:

click to get code

The query can be executed at the DBpedia SPARQL endpoint http://live.dbpedia.org/sparql (default graph http://dbpedia.org). Read from the inside out, the query does the following (a sketch assembled from these steps is shown after the list):
  1. Count the number of subjects related to TV show “Friends”.
  2. Get all TV shows that share at least one subject with “Friends” and count how many they have in common.
  3. For each of those related shows count the number of subjects they are related to.
  4. Now we can calculate the relative overlap in subjects, which is (number of shared subjects) / (number of subjects for “Friends” + number of subjects for the other show – number of common subjects).
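The original query sits behind the “click to get code” link above. As an illustration only, here is a minimal sketch that follows the four steps just described; the restriction to dbo:TelevisionShow, the variable names, and the LIMIT are my own assumptions and may differ from the author’s query:

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbo:     <http://dbpedia.org/ontology/>

SELECT ?showB ?subjCountShowAB ?subjCountShowA ?subjCountShowB
       (?subjCountShowAB / (?subjCountShowA + ?subjCountShowB - ?subjCountShowAB) AS ?subjScore)
WHERE {
  # (1) number of subjects associated with "Friends"
  {
    SELECT (COUNT(DISTINCT ?subjA) AS ?subjCountShowA)
    WHERE { <http://dbpedia.org/resource/Friends> dcterms:subject ?subjA . }
  }
  # (2) shows sharing at least one subject with "Friends", with the number of shared subjects
  {
    SELECT ?showB (COUNT(DISTINCT ?subj) AS ?subjCountShowAB)
    WHERE {
      <http://dbpedia.org/resource/Friends> dcterms:subject ?subj .
      ?showB dcterms:subject ?subj .
      ?showB a dbo:TelevisionShow .
      FILTER(?showB != <http://dbpedia.org/resource/Friends>)
    } GROUP BY ?showB
  }
  # (3) total number of subjects of each related show
  {
    SELECT ?showB (COUNT(DISTINCT ?subjB) AS ?subjCountShowB)
    WHERE {
      ?showB a dbo:TelevisionShow .
      ?showB dcterms:subject ?subjB .
    } GROUP BY ?showB
  }
  # (4) the relative overlap is computed in the SELECT clause above
}
ORDER BY DESC(?subjScore)
LIMIT 20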

This gives us a score of how related one show is to another one. The results are sorted by score (the higher the better) and these are the results for “Friends”:

showB                     subjCountShowAB   subjCountShowA   subjCountShowB   subjScore
Will_&_Grace                           10               16               18   0.416667
Sex_and_the_City                       10               16               21   0.37037
Seinfeld                               10               16               23   0.344828
Veronica’s_Closet                       7               16               12   0.333333
The_George_Carlin_Show                  6               16                9   0.315789
Frasier                                 8               16               18   0.307692

In the first line of the results we see that “Friends” is associated with 16 subjects (that is the same in every line), “Will & Grace” with 18, and they share 10 subjects. That results in a score of 0.416667. Other characteristics to look at are the actors starring in a show, the creators (authors), or the executive producers.

We can pack all this into one query and retrieve similar TV shows based on shared subjects, starring actors, creators, and executive producers. The inner queries retrieve the shows that share some of those characteristics, count them as shown before, and calculate a score for each dimension. The individual scores can be weighted; in the example here the creator score is multiplied by 0.5 and the producer score by 0.75 to adjust the influence of each of them. A simplified sketch of this combination follows below.

click to get code
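The full combined query is again behind the link above. To show just the combination pattern, here is a simplified sketch that blends only two of the four dimensions (subjects and executive producers); the dbo:executiveProducer property, the restriction to dbo:TelevisionShow, and the division by 2 (instead of 4) are my own assumptions and simplifications:

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbo:     <http://dbpedia.org/ontology/>

SELECT ?showB ?subjScore
       (COALESCE(?execprodScore, 0) AS ?execScore)
       ((?subjScore + 0.75 * COALESCE(?execprodScore, 0)) / 2 AS ?integratedScore)
WHERE {
  # subject-overlap score, computed exactly as in the first query
  {
    SELECT ?showB (?subjAB / (?subjA + ?subjB - ?subjAB) AS ?subjScore)
    WHERE {
      { SELECT (COUNT(DISTINCT ?sA) AS ?subjA)
        WHERE { <http://dbpedia.org/resource/Friends> dcterms:subject ?sA . } }
      { SELECT ?showB (COUNT(DISTINCT ?s) AS ?subjAB)
        WHERE {
          <http://dbpedia.org/resource/Friends> dcterms:subject ?s .
          ?showB dcterms:subject ?s .
          ?showB a dbo:TelevisionShow .
          FILTER(?showB != <http://dbpedia.org/resource/Friends>)
        } GROUP BY ?showB }
      { SELECT ?showB (COUNT(DISTINCT ?sB) AS ?subjB)
        WHERE { ?showB a dbo:TelevisionShow . ?showB dcterms:subject ?sB . }
        GROUP BY ?showB }
    }
  }
  # executive-producer-overlap score, computed the same way; weighted by 0.75 above
  OPTIONAL {
    SELECT ?showB (?epAB / (?epA + ?epB - ?epAB) AS ?execprodScore)
    WHERE {
      { SELECT (COUNT(DISTINCT ?pA) AS ?epA)
        WHERE { <http://dbpedia.org/resource/Friends> dbo:executiveProducer ?pA . } }
      { SELECT ?showB (COUNT(DISTINCT ?p) AS ?epAB)
        WHERE {
          <http://dbpedia.org/resource/Friends> dbo:executiveProducer ?p .
          ?showB dbo:executiveProducer ?p .
          FILTER(?showB != <http://dbpedia.org/resource/Friends>)
        } GROUP BY ?showB }
      { SELECT ?showB (COUNT(DISTINCT ?pB) AS ?epB)
        WHERE { ?showB a dbo:TelevisionShow . ?showB dbo:executiveProducer ?pB . }
        GROUP BY ?showB }
    }
  }
}
ORDER BY DESC(?integratedScore)
LIMIT 20

Judging from the scores in the table below, the author’s integrated score appears to sum all four weighted dimension scores and divide by 4.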

This results in:

showB                               subjScore   starScore   creatorScore   execprodScore   integratedScore
The_Powers_That_Be_(TV_series)        0.17391         0.0            1.0             0.0      0.1684782608
Veronica’s_Closet                     0.33333         0.0            0.0        0.428571      0.1636904761
Family_Album_(1993_TV_series)         0.14285         0.0       0.666667             0.0      0.1190476190
Jesse_(TV_series)                     0.28571         0.0            0.0        0.181818      0.1055194805
Will_&_Grace                          0.41666         0.0            0.0             0.0      0.1041666666
Sex_and_the_City                      0.37037         0.0            0.0             0.0      0.0925925925
Seinfeld                              0.34482         0.0            0.0             0.0      0.0862068965
Work_It_(TV_series)                   0.13043         0.0            0.0        0.285714      0.0861801242
Better_with_You                       0.25            0.0            0.0           0.125      0.0859375
Dream_On_(TV_series)                  0.16666         0.0       0.333333             0.0      0.0833333333
The_George_Carlin_Show                0.31578         0.0            0.0             0.0      0.0789473684
Frasier                               0.30769         0.0            0.0             0.0      0.0769230769
Everybody_Loves_Raymond               0.30434         0.0            0.0             0.0      0.0760869565
Madman_of_the_People                  0.3             0.0            0.0             0.0      0.075
Night_Court                           0.3             0.0            0.0             0.0      0.075
What_I_Like_About_You_(TV_series)     0.25            0.0            0.0          0.0625      0.07421875
Monty_(TV_series)                     0.15        0.14285            0.0             0.0      0.0732142857
Go_On_(TV_series)                     0.13043     0.07692            0.0        0.111111      0.0726727982
The_Trouble_with_Larry                0.19047         0.1            0.0             0.0      0.0726190476
Joey_(TV_series)                      0.21739     0.07142            0.0             0.0      0.0722049689

Each line shows the individual scores for each of the predicates used and in the last column the final score. You can also try out the query with “House” <http://dbpedia.org/resource/House_(TV_series)> or “Suits” <http://dbpedia.org/resource/Suits_(TV_series)> and get shows related to those.

This approach can be used for any similar data where we want to obtain similar items based on the characteristics they share. One could, for example, compare persons (by profession, interests, …) or consumer electronics products like photo cameras (resolution, storage, size or price range).

Categories: Blogroll

ADEQUATe for the Quality of Open Data

Semantic Web Company - Tue, 2015-11-03 05:19

The ADEQUATe project builds on two observations: an increasing amount of Open Data is becoming available as an important resource for emerging businesses, and, furthermore, the integration of such open, freely re-usable data sources into organisations’ data warehouses and data management systems is seen as a key success factor for competitive advantage in a data-driven economy.

The project identifies crucial issues which have to be tackled in order to fully exploit the value of open data and to integrate it efficiently with other data sources:

  1. the overall quality issues with metadata and the data itself
  2. the lack of interoperability between data sources

The project’s approach is to address these points at an early stage – when the open data is freshly provided by governmental organisations or others.

The ADEQUATe project works with a combination of data-driven and community-driven approaches to address the above-mentioned challenges. These include 1) the continuous assessment of the data quality of Open Data portals based on a comprehensive list of quality metrics, 2) the application of a set of (semi-)automatic algorithms in combination with crowdsourcing approaches to fix identified quality issues, and 3) the use of Semantic Web technologies to transform legacy Open Data sources (mainly common text formats) into Linked Data.

The project intends to research and develop novel automated and community-driven data quality improvement techniques and then integrate pilot implementations into existing Open Data portals (data.gv.at and opendataportal.at). Furthermore, a quality assessment and monitoring framework will evaluate and demonstrate the impact of the ADEQUATe solutions for the above-mentioned business case.

About: ADEQUATe is funded by the Austrian FFG under the “ICT of the Future” programme. The project is run by Semantic Web Company together with the Institute for Information Business of the Vienna University of Economics & Business and the Department for E-Governance and Administration at Danube University Krems. The project started in August 2015 and will run until March 2018.

 

 

Categories: Blogroll

ICML 2016 in NYC and KDD Cup 2016

Machine Learning Blog - Fri, 2015-10-30 18:41

ICML 2016 is in New York City. I expect it to be the largest ICML by far given the destination—New York is the place which is perhaps easiest to reach from anywhere in the world and it has the largest machine learning meetup anywhere in the world.

I am the general chair this year, which is light in work but heavy in responsibilities. Some things I worry about:

  1. How many people will actually come? Numbers are difficult to guess with the field growing and the conference changing locations. I believe we need capacity for at least 3000 people based on everything I know.
  2. New York is expensive. What can be done about it? One thought is that we should actively set up a roommate-finding system so the costs of hotels can be shared. Up to 3 people can share a hotel room in the conference hotel (yes, each with their own bed), and that makes the price much more reasonable. I’m also hoping donations will substantially defray the cost. If others have creative ideas, I’m definitely interested.

Markus Weimer also points out the 2016 KDD Cup, which has a submission deadline of December 6. KDD Cup datasets have become a common reference for many machine learning papers, so this is a good way to get your problem solved well by many people.

Categories: Blogroll

Agile Web Mining at Bing

Data Mining Blog - Fri, 2015-10-30 18:35

The web search industry is making great progress in transitioning from building tools for finding pages and sites to building tools that leverage and surface facts and knowledge. The local search space - where I work in Bing - is founded on structured knowledge: the entity data that represents businesses and other things that necessarily have a location. It is a core piece of the knowledge space required for this future.

Over the past few years, my team has been working on mining the web for information about local entities. This data now helps to power a significant percentage of local search interactions in a number of countries around the world.

As we have been working on this system, we have come to think deeply not only about how to build systems for web mining, but also about how to construct efficient developer workflows and how to add data management components to these systems so we can take advantage of human input when appropriate.

These processes constitute what I term Agile Web Mining, the core principles of which are: optimize for developer productivity, optimize for data management, and invest in low-latency systems. So much of what we hear about in the industry currently revolves around very large data sets (big data), which often entail long processing times and high-latency interactions. In contrast, we tend to think of our data in a different way, where the size of the data is relatively small (on the order of the size of a web site), but where there are many examples of these small data sets.

We are currently growing our team, and so if you are interested in learning more about Agile Web Mining, please get in touch with me.

Categories: Blogroll

Ensure data consistency in PoolParty

Semantic Web Company - Fri, 2015-10-09 10:12

Semantic Web Company and its PoolParty team are participating in the H2020-funded project ALIGNED. This project evaluates software engineering and data engineering processes in the context of how these two worlds can be aligned in an efficient way. All project partners are working on several use cases, which will result in a set of detailed requirements for combined software and data engineering. The ALIGNED project framework also includes work and research on data consistency in the PoolParty Thesaurus Server (PPT).

ALIGNED: Describing, finding and repairing inconsistencies in RDF data sets

When using RDF to represent the data model of applications, inconsistencies can occur. Compared with the schema approach of relational databases, a data model using RDF offers much more flexibility. Usually, the application’s business logic produces and modifies the model data and can therefore guarantee the consistency needed for its operations. However, information may not only be created and modified by the application itself but may also originate from external sources, such as RDF imports into the data model’s triple store. This may result in inconsistent model data causing the application to fail. Therefore, constraints have to be specified and enforced to ensure data consistency for the application. In Phase 1 of the ALIGNED project, we outline the problem domain and requirements for the PoolParty Thesaurus Server use case, with the goal of establishing a solution for describing, finding and repairing inconsistencies in RDF data sets. We propose a framework as a basis for integrating RDF consistency management into PoolParty Thesaurus Server software components. The approach is a work in progress that aims to adopt technologies developed by the ALIGNED project partners and refine them for use in an industrial-strength application.

Technical View

Users of PoolParty often wish to import arbitrary datasets, vocabularies, or ontologies. But these datasets do not always meet the constraints PoolParty imposes. Currently, when users attempt to import data which violates the constraints, the data will simply fail to display or, in the worst case, cause unexpected behaviour and lead to or reflect errors in the application. An enhanced PoolParty will tell the user why the import has failed, suggest ways in which the user can fix the problem, and also identify potential new constraints that could be applied to the data structure. Apart from the import functionality, various other software components, like the taxonomy editor or the reasoning engine, drive RDF data constraints and vice versa. The following figure outlines the utilization and importance of data consistency constraints in the PoolParty application:

[Figure: data consistency constraints and the PoolParty components that use them]

Approaches and solutions for many of these components already exist. However, the exercise within ALIGNED is to integrate them in an easy-to-use way that complies with the PoolParty environment. Consistency constraints, for example, can be formulated using RDF Data Shapes or by interpreting RDFS/OWL constructs with constraint-based semantics; RDFUnit already partly supports these techniques. Repair strategies and curation interfaces are covered by the Seshat Global History Databank project. Automated repair of large datasets can be managed by the UnifiedViews ETL tool, whereas immediate notification of data inconsistencies can be disseminated via the rsine semantic notification framework. A sketch of the kind of consistency check involved is shown below.
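The post does not list PoolParty’s concrete consistency constraints, but to illustrate the kind of check such a framework would run, here is a sketch of one typical SKOS constraint (at most one preferred label per language), expressed as a plain SPARQL query that reports violating concepts; the constraint choice is my own illustrative assumption:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Report concepts that carry more than one skos:prefLabel in the same language,
# which violates a common SKOS consistency constraint.
SELECT ?concept ?lang (COUNT(?label) AS ?labelCount)
WHERE {
  ?concept a skos:Concept ;
           skos:prefLabel ?label .
  BIND(LANG(?label) AS ?lang)
}
GROUP BY ?concept ?lang
HAVING (COUNT(?label) > 1)
ORDER BY ?concept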

Outlook

Within the ALIGNED project, all project partners demand simple (i.e. maintainable and usable) data quality and consistency management and work on solutions to meet their requirements. Our next steps will encompass research on how to apply these technologies to the PoolParty problem domain and participation in unifying and integrating the different existing tools and approaches. The immediate challenge will be to build an interoperable catalog of formalized PoolParty data consistency constraints and repair strategies so that they are machine-processable in a (semi-)automatic way.

Categories: Blogroll

SPARQL analytics proves boxers live dangerously

Semantic Web Company - Tue, 2015-09-29 09:30

You have always thought that SPARQL is only a query language for RDF data? Then think again, because SPARQL can also be used to implement some cool analytics. I show here two queries that demonstrate that principle.

For simplicity we use a publicly available dataset of DBpedia on an open SPARQL endpoint: http://live.dbpedia.org/sparql (execute with default graph = http://dbpedia.org).

Mean life expectancy for different sports

The query shown here starts from the class dbp:Athlete and retrieves its subclasses, which cover the different sports. With those, the athletes of each area are obtained together with their birth and death dates (i.e. we only take deceased individuals into account). From the dates the years are extracted; a regular expression is used here because the SPARQL function for extracting years from a literal of a date type returned errors and could not be used. From the birth and death years the age is calculated (we filter for a range of 20 to 100 years because erroneous entries always have to be accounted for in data sources like this). Then the data is simply grouped, and for each sport we count the number of athletes selected and compute the average age they reached.

prefix dbp:<http://dbpedia.org/ontology/>
select ?athleteGroupEN (count(?athlete) as ?count) (avg(?age) as ?ageAvg)
where {
  filter(?age >= 20 && ?age <= 100) .
  {
    select distinct ?athleteGroupEN ?athlete (?deathYear - ?birthYear as ?age)
    where {
      ?subOfAthlete rdfs:subClassOf dbp:Athlete .
      ?subOfAthlete rdfs:label ?athleteGroup filter(lang(?athleteGroup) = "en") .
      bind(str(?athleteGroup) as ?athleteGroupEN)
      ?athlete a ?subOfAthlete .
      ?athlete dbp:birthDate ?birth filter(datatype(?birth) = xsd:date) .
      ?athlete dbp:deathDate ?death filter(datatype(?death) = xsd:date) .
      bind (strdt(replace(?birth,"^(\\d+)-.*","$1"),xsd:integer) as ?birthYear) .
      bind (strdt(replace(?death,"^(\\d+)-.*","$1"),xsd:integer) as ?deathYear) .
    }
  }
} group by ?athleteGroupEN having (count(?athlete) >= 25) order by ?ageAvg

The results are not unexpected and show that athletes in the areas of motor sports, wrestling, and boxing die at a younger age. On the other hand, horse riders, but also tennis and golf players, live clearly longer on average.

athleteGroupEN                                       count   ageAvg
wrestler                                               693   58.962481962481962
winter sport Player                                   1775   66.60169014084507
tennis player                                          577   71.483535528596187
table tennis player                                     45   68.733333333333333
swimmer                                                402   68.674129353233831
soccer player                                         6572   63.992391965916007
snooker player                                          25   70.12
rugby player                                          1452   67.272038567493113
rower                                                   69   63.057971014492754
poker player                                            30   66.866666666666667
national collegiate athletic association athlete        44   68.090909090909091
motorsport racer                                      1237   58.117219078415521
martial artist                                         197   67.157360406091371
jockey (horse racer)                                   139   65.992805755395683
horse rider                                            181   74.651933701657459
gymnast                                                175   65.805714285714286
gridiron football player                              4247   67.713680244878738
golf player                                            400   71.13
Gaelic games player                                     95   70.589473684210526
cyclist                                               1370   67.469343065693431
cricketer                                             4998   68.420368147258904
chess player                                            45   70.244444444444444
boxer                                                  869   60.352128883774453
bodybuilder                                             27   52
basketball player                                      822   66.165450121654501
baseball player                                       9207   68.611382643640708
Australian rules football player                      2790   69.52831541218638

Doing this directly in SPARQL is especially relevant when the data is large and one would otherwise have to extract it from the database and import it into another tool just to do the counting and calculations.

Simple statistical measures over life expectancy

Another standard statistical measure is the standard deviation. A good description of how to calculate it can be found, for example, here. We start again with the class dbp:Athlete and calculate the ages the athletes reached (this time for the entire class dbp:Athlete, not its subclasses). Another thing we need is the square of each age, which we calculate with “(?age * ?age as ?ageSquare)”. At the next stage we count the number of athletes in the result and calculate the average age, the square of the sum, and the sum of the squares. With those values we can then calculate the standard deviation of the ages in our data set. Note that SPARQL does not specify a function for calculating square roots, but RDF stores like Virtuoso (which hosts the DBpedia data) provide additional functions like bif:sqrt for calculating the square root of a value.
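In formula form, the query below implements the usual computational form of the sample standard deviation over the ages x_i of the n selected athletes (in LaTeX notation):

s = \sqrt{\frac{\sum_i x_i^2 - \left(\sum_i x_i\right)^2 / n}{n - 1}}

This corresponds directly to the expression bif:sqrt((?ageSquareSum - (?ageSumSquare / ?count)) / (?count - 1)) in the outer select, where ?ageSquareSum is the sum of squares and ?ageSumSquare is the square of the sum.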

prefix dbp:<http://dbpedia.org/ontology/>
select ?count ?ageAvg (bif:sqrt((?ageSquareSum - (strdt(?ageSumSquare,xsd:double) / ?count)) / (?count - 1)) as ?standDev)
where {
  {
    select (count(?athlete) as ?count) (avg(?age) as ?ageAvg) (sum(?age) * sum(?age) as ?ageSumSquare) (sum(?ageSquare) as ?ageSquareSum)
    where {
      {
        select ?subOfAthlete ?athlete ?age (?age * ?age as ?ageSquare)
        where {
          filter(?age >= 20 && ?age <= 100) .
          {
            select distinct ?subOfAthlete ?athlete (?deathYear - ?birthYear as ?age)
            where {
              ?subOfAthlete rdfs:subClassOf dbp:Athlete .
              ?athlete a ?subOfAthlete .
              ?athlete dbp:birthDate ?birth filter(datatype(?birth) = xsd:date) .
              ?athlete dbp:deathDate ?death filter(datatype(?death) = xsd:date) .
              bind (strdt(replace(?birth,"^(\\d+)-.*","$1"),xsd:integer) as ?birthYear) .
              bind (strdt(replace(?death,"^(\\d+)-.*","$1"),xsd:integer) as ?deathYear) .
            }
          }
        }
      }
    }
  }
}

 

count   ageAvg               standDev
38542   66.876290799647138   17.6479

These examples show that SPARQL is quite powerful and a lot more than “just” a query language for RDF data: it is possible to implement basic statistical methods directly at the level of the triple store, without the need to extract the data and import it into another tool.

Categories: Blogroll

Web 2: But Wait, There's More (And More....) - Best Program Ever. Period.

Searchblog - Thu, 2011-10-13 12:20
I appreciate all you Searchblog readers out there who are getting tired of my relentless Web 2 Summit postings. And I know I said my post about Reid Hoffman was the last of its kind. And it was, sort of. Truth is, there are a number of other interviews happening... (Go to Searchblog Main)
Categories: Blogroll

Help Me Interview Reid Hoffman, Founder, LinkedIn (And Win Free Tix to Web 2)

Searchblog - Wed, 2011-10-12 11:22
Our final interview at Web 2 is Reid Hoffman, co-founder of LinkedIn and legendary Valley investor. Hoffman is now at Greylock Partners, but his investment roots go way back. A founding board member of PayPal, Hoffman has invested in Facebook, Flickr, Ning, Zynga, and many more. As he wears (at... (Go to Searchblog Main)
Categories: Blogroll

Help Me Interview the Founders of Quora (And Win Free Tix to Web 2)

Searchblog - Tue, 2011-10-11 12:54
Next up on the list of interesting folks I'm speaking with at Web 2 are Charlie Cheever and Adam D'Angelo, the founders of Quora. Cheever and D'Angelo enjoy (or suffer from) Facebook alumni pixie dust - they left the social giant to create Quora in 2009. It grew quickly after... (Go to Searchblog Main)
Categories: Blogroll

Help Me Interview Ross Levinsohn, EVP, Yahoo (And Win Free Tix to Web 2)

Searchblog - Tue, 2011-10-11 11:46
Perhaps no man is braver than Ross Levinsohn, at least at Web 2. First of all, he's the top North American executive at a long-besieged and currently leaderless company, and second because he has not backed out of our conversation on Day One (this coming Monday). I spoke to Ross... (Go to Searchblog Main)
Categories: Blogroll

I Just Made a City...

Searchblog - Mon, 2011-10-10 13:41
...on the Web 2 Summit "Data Frame" map. It's kind of fun to think about your company (or any company) as a compendium of various data assets. We've added a "build your own city" feature to the map, and while there are a couple bugs to fix (I'd like... (Go to Searchblog Main)
Categories: Blogroll

Help Me Interview Vic Gundotra, SVP, Google (And Win Free Tix to Web 2)

Searchblog - Mon, 2011-10-10 13:03
Next up on Day 3 of Web 2 is Vic Gundotra, the man responsible for what Google CEO Larry Page calls the most exciting and important project at this company: Google+. It's been a long, long time since I've heard as varied a set of responses to any Google project... (Go to Searchblog Main)
Categories: Blogroll

Help Me Interview James Gleick, Author, The Information (And Win Free Tix to Web 2)

Searchblog - Sat, 2011-10-08 20:16
Day Three kicks off with James Gleick, the man who has written the book of the year, at least if you are a fan of our conference theme. As I wrote in my review of "The Information," Gleick's book tells the story of how, over the past five thousand or... (Go to Searchblog Main)
Categories: Blogroll

I Wish "Tapestry" Existed

Searchblog - Fri, 2011-10-07 14:34
(image) Early this year I wrote File Under: Metaservices, The Rise Of, in which I described a problem that has burdened the web forever, but to my mind is getting worse and worse. The crux: "...heavy users of the web depend on scores - sometimes hundreds - of services,... (Go to Searchblog Main)
Categories: Blogroll

Help Me Interview Steve Ballmer, CEO of Microsoft (And Win Free Tix to Web 2)

Searchblog - Fri, 2011-10-07 12:17
Day Two at Web 2 Summit ends with my interview of Steve Ballmer. Now, the last one, some four years ago, had quite a funny moment. I asked Steve about how he intends to compete with Google on search. It's worth watching. He kind of turns purple. And not... (Go to Searchblog Main)
Categories: Blogroll

Me, On The Book And More

Searchblog - Thu, 2011-10-06 12:05
Thanks to Brian Solis for taking the time to sit down with me and talk both specifically about my upcoming book, as well as many general topics.... (Go to Searchblog Main)
Categories: Blogroll