US9436810B2 - Determination of copied content, including attribution - Google Patents
- Tue Sep 06 2016
Publication number
- US9436810B2 (application US14/541,422)
Authority
- US
- United States
Prior art keywords
- content
- owner
- received
- controlled
- received content
Prior art date
- 2006-08-29
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/00086—Circuits for prevention of unauthorised reproduction or copying, e.g. piracy
- G11B20/00166—Circuits for prevention of unauthorised reproduction or copying, e.g. piracy involving measures which result in a restriction to authorised contents recorded on or reproduced from a record carrier, e.g. music or software
- G11B20/00173—Circuits for prevention of unauthorised reproduction or copying, e.g. piracy involving measures which result in a restriction to authorised contents recorded on or reproduced from a record carrier, e.g. music or software wherein the origin of the content is checked, e.g. determining whether the content has originally been retrieved from a legal disc copy or another trusted source
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9538—Presentation of query results
-
- G06F17/30864—
-
- G06F17/30867—
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
- H04L63/0263—Rule management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0247—Calculate past, present or future revenues
Definitions
- Content such as text, images, and video
- an online service provider such as Google or YouTube
- Non-compliant content may include material that violates third party copyrights or trademarks, is illegal (e.g., child pornography), or otherwise does not comply with a content owner's terms of use or with an OSP policy.
- Examples of potentially non-compliant use of content include bloggers copying text from news reports, eBay sellers copying other sellers' listing content, aggregators republishing listings from other sites, spammers using copyrighted text to create web pages to influence search results and generate advertising revenue, or even innocent/accidental use of non-compliant content by a conscientious consumer.
- Content on the Internet is difficult to monitor for compliance.
- a content owner manually monitors the Internet for copies of the owner's content through repetitive queries in search engines like Google.
- the use of the owner's content is permissible under their own license terms or under legal principles such as the copyright concept of “fair use,” which considers such factors as whether attribution has been provided, what portion of the content has been used without permission, and whether the content has been used for commercial purposes (such as generating advertising or subscription revenue).
- Content owners have no automated methods to evaluate the context in which their content is used by others.
- the content owner's objective usually is to cause the content to be removed from third-party services that host the content or search engines which refer users to it through their indices.
- Digital Millennium Copyright Act (DMCA)
- the DMCA provides OSPs and search engines with a safe harbor from copyright infringement liability if they promptly remove content from their service upon request by the content owner. Therefore, when a content owner finds a copy of his content, he can choose to send a take down notice under DMCA by writing a letter or an email to the OSP or search engine. In response, the OSP or search engine typically must manually remove the content from their service to avoid liability.
- monitoring for content that does not comply with the OSP's host policy is also typically a manual process.
- When OSPs monitor content as it is uploaded, a human typically views and approves the content before (or after) it is displayed, and non-compliant content is rejected (or removed).
- OSPs also must manually review and compare content when they receive DMCA notices, and often have little information to determine if content is out of compliance and no automated way to determine the identity or reputation of the complaining party.
- manual content monitoring and enforcement processes are becoming increasingly impractical. Therefore, improved methods for monitoring content and managing enforcement of non-compliant content are needed.
- FIG. 1 is a block diagram illustrating an embodiment of a content monitoring system.
- FIG. 2A is a flow chart illustrating an embodiment of a process for monitoring content.
- FIG. 2B is a flow chart illustrating an embodiment of a process for monitoring for use of controlled content.
- FIG. 2C is a flow chart illustrating an embodiment of a process for evaluating context of a content object.
- FIG. 2D is a flow chart illustrating an embodiment of a process for monitoring for use of controlled content.
- FIG. 2E is a flow chart illustrating an embodiment of a process for engaging with a user of non-compliant content.
- FIG. 2F is a flow chart illustrating an embodiment of a process for displaying compliance information.
- FIG. 3 is an example of a graphical user interface (GUI) for providing controlled content.
- FIG. 4A is an example of a GUI for providing controlled content.
- FIG. 4B is an example of a GUI for providing usage rules.
- FIG. 5 is an example of a GUI for displaying search results.
- FIG. 6 is an example of a GUI for displaying use of a content object.
- FIG. 7 is a block diagram illustrating an embodiment of a system for making a determination of originality of content.
- FIG. 8 is a flowchart illustrating an embodiment of a process for performing an originality determination.
- FIG. 9 is a flowchart illustrating an embodiment of a process for making an originality determination.
- FIG. 10 is a flowchart illustrating an embodiment of a process for computing an originality score for a content object.
- FIG. 11 is a flowchart illustrating an embodiment of a process for analyzing originality factors related to the host and/or claimed owner of a content object.
- FIG. 12 is a flowchart illustrating an embodiment of a process for analyzing originality factors.
- FIG. 13 is a block diagram illustrating an example of originality factors related to the reputation of a host.
- FIG. 14 is a flowchart illustrating an example usage of a system for determining originality.
- the invention can be implemented in numerous ways, including as a process, an apparatus, a system, a composition of matter, a computer readable medium such as a computer readable storage medium or a computer network wherein program instructions are sent over optical or communication links.
- these implementations, or any other form that the invention may take, may be referred to as techniques.
- a component such as a processor or a memory described as being configured to perform a task includes both a general component that is temporarily configured to perform the task at a given time and a specific component that is manufactured to perform the task.
- the order of the steps of disclosed processes may be altered within the scope of the invention.
- FIG. 1 is a block diagram illustrating an embodiment of a content monitoring system.
- content monitoring system 100 is used by a content owner to monitor for non-compliant use of the content owner's content based on usage rules specified by the content owner.
- content owners include: a photographer (e.g., Ansel Adams), a film studio (e.g., Disney), a columnist (e.g., Walter Mossberg), or a media outlet (e.g., The Wall Street Journal).
- the content owner is not necessarily the same as the content creator.
- Usage rules are a set of rules regarding conditions under which content may be used, as specified by the content owner. Usage rules may vary depending on the content and/or the content owner and applicable law (such as “fair use”). Usage rules are more fully described below.
- content monitoring system 100 is used by a content host to monitor for non-compliant use of content based on a host policy specified by the content host.
- a content host refers to an entity that hosts, serves, stores, provides, and/or displays content. Examples of content hosts include OSPs, such as search engines (e.g., Google), photo or video sharing websites (e.g., YouTube, Yahoo), and blogging sites (e.g., TypePad).
- an OSP is an entity that hosts and/or serves or provides content on behalf of itself or other entities.
- an OSP includes an OSP as defined under DMCA.
- An OSP includes an electronic content management system (ECM).
- a host policy is a set of rules regarding conditions under which content may be hosted, as specified by a content host.
- a host policy may vary depending on the content host.
- OSPs may have policies that apply to the posting of content by their users, in which they reserve the right to remove content or users in the event of non-compliance (determined at their discretion).
- a configurable host policy governs the automatic handling of DMCA notices, as more fully described below.
- a content user includes an entity that uses content that is not owned by the content user.
- a content user includes an entity that owns or posts content. Examples of content users include writers, photographers, bloggers, or any user who posts content on content hosts.
- Controlled content refers to content associated with one or more compliance rules, where compliance rules include usage rules specified by a content owner and/or host policy rules specified by a content host.
- controlled content is the content owner's content.
- controlled content is content that is non-compliant with the host policy.
- Monitored content refers to the set of content being searched (i.e., potential matches). In other words, content monitoring system 100 searches monitored content for use of controlled content.
- a match, copy, or use of controlled content does not necessarily refer to an identical match, an identical copy, or use of identical content.
- a match, copy, or use of controlled content is identified based on criteria such as similarity scores and non-compliance scores, as more fully described below.
- Compliant content refers to content that satisfies usage rules associated with the content.
- compliant content refers to content that not only satisfies the usage rules, but also satisfies the host policy of the content host (e.g., the OSP).
- Content objects can include any object type. Examples of content objects include a text document, an image, video, audio, flash, animation, game, lyrics, code, or portions thereof (e.g., a phrase/sentence/paragraph, a subimage, or a video clip). Other examples include a single file (e.g., an image), all of the text on a web page (e.g., a news article), a chapter in a book, and a blog entry.
- the content object may be in various audio, image, or video formats, such as MP3, JPEG, MPEG, etc.
- Content monitoring system 100 can be used to find copies of a set of content at a given point in time or regularly monitor for matches.
- Content monitoring system 100 may be used to monitor data associated with the Internet or any other appropriate environment in which there is a need to monitor content for compliance. Examples of appropriate environments include the Internet, an Intranet, a firewalled network, a private network, an Electronic Data Interchange (EDI) network, an ad hoc network, etc.
- user 102 provides input to ingestor 104 .
- Ingestor 104 provides input to subscriber database 105 , content database 108 , and crawler 112 .
- Reporter 110 receives input from subscriber database 105 and content database 108 .
- Crawler 112 provides input to digester 114 .
- Digester 114 provides input to content database 108 , controlled content store 116 , and monitored content store 118 .
- Matching engine 120 provides input to controlled content store 116 and monitored content store 118 .
- Content database 108 interacts with matching engine 120 .
- Content ingestor 104 accepts controlled content from user 102 .
- User 102 includes content owners or administrators of content monitoring system 100 .
- the content may be specified in various ways.
- a user interface (UI) may be provided for user 102 to specify content.
- the UI provides an interface for uploading content or specifying a link/set of links to the content, where the links may be local (e.g., on a local hard drive) or remote (e.g., on a remote server or on the Internet).
- An example of a remote link is a user's eBay account.
- User 102 may display, in his eBay store, images to be monitored. For example, user 102 is a photographer selling his photography.
- user 102 specifies a URL to the eBay store or particular auction.
- the content owner instead of providing a URL to a particular auction, the content owner provides their username (such as an eBay seller ID), which allows the system to retrieve all of the user-posted content associated with that username, which could be associated with one or more auctions.
- the content owner also provides a password if necessary or expedient to locate user-posted content.
- a schedule for fetching content may be specified. For example, crawler 112 may be configured to fetch images from the user's eBay store every 24 hours. The raw content is passed to digester 114 for processing and storage.
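The fetch schedule described above can be illustrated with a minimal sketch. The function name `is_fetch_due` and its parameters are hypothetical and do not appear in the source; only the 24-hour interval comes from the example in the text.

```python
from datetime import datetime, timedelta


def is_fetch_due(last_fetch, now, interval=timedelta(hours=24)):
    """Return True if the source was never fetched or the interval has elapsed.

    `last_fetch` is None for a source that has not been fetched yet.
    The 24-hour default mirrors the eBay store example; any interval works.
    """
    return last_fetch is None or now - last_fetch >= interval
```

A crawler loop could call this check for each registered source and pass the raw content of any due source on for digestion.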
- the ingesting of content is automatically triggered by content creation. For example, when a blogger posts a new entry, it is automatically ingested. When a writer updates a Word document, the content is automatically ingested.
- the user is presented with a means to exclude or include specific content objects (such as a single image) from monitoring and from the content owner's usage rules.
- the controlled content may be from the Internet or from another source.
- a manual or automated API may be used to ingest content or perform any of the other processes described herein.
- a URL or any other appropriate identifier may be used to specify content. Credentials associated with accessing the content, such as a password, may be provided.
- Additional data may be provided as input to content monitoring system 100 , such as links (e.g., URLs or websites) identified by an administrator, content host, or content owner. These sites may have been identified because the user is aware of a specific instance of non-compliance at that location, because they have historically posted non-compliant content, or because they are of particular concern to the user. Other examples of additional data that may be input to content monitoring system 100 are more fully described below.
- Crawler 112 fetches content from the network.
- the content to be fetched may include the Internet, a subset of the Internet, a complete domain, or a single piece of content from the web.
- Identifiers may be used to identify the content to be fetched. Some examples of identifiers include: a URL, a directory, a password protected website(s), all items for a seller on eBay, and all content of a given type or format (e.g., images only or JPEGs only).
- crawler 112 is used with modules that provide different rules for crawling. In some embodiments, crawler 112 fetches content according to a specified schedule.
- Controlled content store 116 includes controlled content.
- controlled content store 116 includes the following information: a copy of the content, an index of fingerprints associated with the content, and metadata about the content (e.g., filename, URL, fetch date, etc.).
- the copy of the content is stored in a separate cache.
- a fingerprint includes a signature of an object that can be used to detect a copy of an object as a whole or in part.
- a content object may have more than one fingerprint.
- a fingerprint may be associated with more than one content object.
- a fingerprint may be associated with a whole or part of a content object.
- a fingerprint may be multidimensional. For example, there may be multiple features associated with a fingerprint.
- a fingerprint may contain multiple fingerprints or subfingerprints.
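One way to picture the store contents and the many-to-many relationship between fingerprints and content objects is the sketch below. It is an illustration under assumptions, not the patent's implementation: the class names, field names, and the in-memory inverted index are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ContentRecord:
    content_id: str
    content: bytes          # copy of the content (could live in a separate cache)
    metadata: dict          # e.g., filename, URL, fetch date
    fingerprints: set = field(default_factory=set)  # an object may have several


class ControlledContentStore:
    """Hypothetical in-memory store with an inverted fingerprint index."""

    def __init__(self):
        self.records = {}
        self.index = defaultdict(set)  # fingerprint -> ids of objects that carry it

    def add(self, record):
        self.records[record.content_id] = record
        for fp in record.fingerprints:
            self.index[fp].add(record.content_id)

    def lookup(self, fingerprint):
        """Return the ids of all content objects associated with a fingerprint."""
        return set(self.index.get(fingerprint, set()))
```

Keeping the index keyed by fingerprint means a single fingerprint observed in monitored content can be traced back to every controlled object that shares it.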
- Monitored content store 118 is a repository for crawled data. Monitored content store 118 may include any digital object collection or environment. In some embodiments, monitored content store 118 is a web store. In some embodiments, there are multiple content stores, e.g., one for each kind of data—text, images, audio, video, etc. In some embodiments, monitored content store 118 includes data from sites that copy the most often, and is updated most frequently. This data may be indicated as such (i.e., tagged or flagged as common copier) or stored separately.
- a real-time store (not shown) is used to store various feeds coming in (e.g., from a content owner's blog each time the blog is updated, or from a content owner's eBay store every 24 hours).
- a ping server or similar server is used to update feeds coming in. If the feeds contain links, the content is fetched by crawler 112 . Over time, data moves from the real-time store to monitored content store 118 as it becomes older. Monitored content store 118 changes periodically, whereas the real-time store keeps changing as content comes in.
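The movement of data from the real-time store to the monitored content store as it ages might be sketched as follows. The dictionary layout, the function name `age_out`, and the one-day cutoff are assumptions made for illustration; the source does not specify when data is considered old.

```python
from datetime import datetime, timedelta


def age_out(real_time_store, monitored_store, now, max_age=timedelta(days=1)):
    """Move items older than max_age from the real-time store to the monitored store.

    real_time_store maps key -> (fetched_at, item); monitored_store maps key -> item.
    """
    for key in list(real_time_store):  # copy the keys because entries are deleted
        fetched_at, item = real_time_store[key]
        if now - fetched_at > max_age:
            monitored_store[key] = item
            del real_time_store[key]
```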
- external stores (not shown), such as search engines, are accessed using application programming interfaces (APIs).
- as data is fetched, it is stored in monitored content store 118 . Some embodiments of this are more fully described below.
- fingerprints of content are stored in monitored content store 118 .
- Gigablast is used to fetch and store content data.
- Digester 114 receives content fetched by crawler 112 , including controlled content or monitored content, and analyzes and processes it. Analysis of content is more fully described below.
- the content and associated metadata are stored in controlled content store 116 or monitored content store 118 , as described above.
- matching engine 120 finds matches to controlled content by comparing controlled content from controlled content store 116 with monitored content from monitored content store 118 based on matching techniques including technical factors, compliance factors, and other factors, as more fully detailed below.
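As a rough sketch of the technical side of matching only, the comparison could be expressed as fingerprint overlap with a threshold. The containment measure, the 0.5 threshold, and all names here are assumptions; the source does not define the similarity score, and compliance factors are not modeled.

```python
def similarity(controlled_fps, monitored_fps):
    """Fraction of the controlled object's fingerprints found in the monitored object."""
    if not controlled_fps:
        return 0.0
    return len(controlled_fps & monitored_fps) / len(controlled_fps)


def find_matches(controlled, monitored, threshold=0.5):
    """Compare every controlled object with every monitored object.

    Both arguments map an object id to its set of fingerprints.
    Returns (controlled_id, monitored_id, score) for pairs at or above the threshold.
    """
    matches = []
    for cid, cfps in controlled.items():
        for mid, mfps in monitored.items():
            score = similarity(cfps, mfps)
            if score >= threshold:
                matches.append((cid, mid, score))
    return matches
```

Because the score is a fraction rather than an equality test, a partial or modified copy can still be reported, which is consistent with the statement that a match need not be identical.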
- Reporter 110 reports match results to user 102 or an administrator of content monitoring system 100 .
- Various user interfaces may be used. Examples of reporting and UIs for reporting results are more fully described below.
- Subscriber database 106 contains information about customers.
- Content database 108 contains references to controlled content and to matched content corresponding to the controlled content. In some embodiments, a separate database is used for matched content.
- content monitoring system 100 is used as a content clearinghouse by content users wishing to use content. Before using a particular content object (i.e., unit of content), the content user checks with content monitoring system 100 to determine whether the conditions under which the content user wishes to use the content comply with the usage policy set by the content owner.
- Content monitoring system 100 may be implemented in various ways in various embodiments.
- controlled content, web data, subscriber data, and/or content data may be organized and stored in one or more databases.
- Ingesting, crawling, digesting, matching, and/or reporting may be performed using one or more processing engines.
- any of the functions provided by content monitoring system 100 may be provided as a web service.
- content monitoring system 100 or an element of content monitoring system 100 is queried and provides information via XML.
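A response provided via XML might look like the output of the sketch below. The element and attribute names (`matches`, `match`, `content`, `url`, `score`) are invented for illustration; the source does not define a schema.

```python
import xml.etree.ElementTree as ET


def match_report_xml(content_id, matches):
    """Serialize match results as XML; `matches` is a list of (url, score) pairs."""
    root = ET.Element("matches", {"content": content_id})
    for url, score in matches:
        ET.SubElement(root, "match", {"url": url, "score": f"{score:.2f}"})
    return ET.tostring(root, encoding="unicode")
```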
- FIG. 2A is a flow chart illustrating an embodiment of a process for monitoring content. In some embodiments, this process is performed when a content owner is searching or monitoring for non-compliant use of the owner's controlled content. In some embodiments, this process is performed by content monitoring system 100 .
- Controlled content may include text, images, video, or any other type of data. Controlled content may be specified in various ways, such as content located in a particular directory and/or all content contributed by a particular user (e.g., on eBay).
- a user e.g., a content owner or an administrator
- the user may also request a one time search or regular monitoring for the controlled content. In the case of the latter, the user may specify options related to regular monitoring, such as frequency of monitoring, how often reports should be received, etc.
- usage rules include conditions under which a content owner permits the use of owned content.
- Usage rules may include terms under which a content owner permits the republication and/or modification of content.
- Usage rules may include different conditions depending on whether the use is for commercial or non-commercial uses, business or education uses, with or without attribution, in a limited amount, in a limited context, etc.
- the usage rules may be based on any appropriate compliance structure, such as “fair use,” “copy left,” “share alike,” Creative Commons specified structures, user specific compliance rules, rules against associating the controlled content with objectionable content (e.g., obscenity, adult content, child pornography), rules requiring attribution, moral rights, rights of personality, or any legal or personal compliance structure.
- a usage rule may take into account editorial context. In other words, certain uses may be permitted that are not permitted in another context. For example, if the controlled content object is a book, portions from the book may be permitted to be used in a book review but not in another context (where other rules may apply).
- a variety of user interfaces may be used to specify usage rules. For example, a list of terms, checkboxes (to apply a rule), and settings (specific to a rule) may be provided. The list may include, for example: whether attribution is required, amount of duplication allowed, whether commercial use is allowed, whether changes are allowed, whether permission is required, whether derivative content is allowed, geographical requirements, whether the owner requires advertisement revenue sharing (e.g., using Google AdSense) and associated terms and information, etc.
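A few of the terms listed above could be represented and checked as in the following sketch. Only three terms are modeled (attribution, commercial use, amount of duplication); the class names, field names, and default values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UsageRules:
    attribution_required: bool = True
    commercial_use_allowed: bool = False
    max_fraction_copied: float = 0.1  # assumed limit on the amount of duplication


@dataclass
class ProposedUse:
    gives_attribution: bool
    is_commercial: bool
    fraction_copied: float


def violations(rules, use):
    """Return a list of the usage rules that the proposed use would break."""
    problems = []
    if rules.attribution_required and not use.gives_attribution:
        problems.append("attribution missing")
    if use.is_commercial and not rules.commercial_use_allowed:
        problems.append("commercial use not allowed")
    if use.fraction_copied > rules.max_fraction_copied:
        problems.append("too much content copied")
    return problems
```

An empty result would indicate that the use complies with the modeled terms; a real system would also need to account for context and applicable law.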
- the usage rules may be hierarchical. For example, a list of higher level rules or compliance structures may be displayed for selection, each of which may be expanded to display lower level rules that each of the high level rules comprises. Usage rules may have any number of levels.
- Checkboxes may be located next to the higher level or lower level rules and may be selected (e.g., checked off) at any level of granularity. For example, selecting checkboxes next to a higher level rule automatically selects all corresponding lower level rules. Alternatively, lower level rules may be individually selected.
- An example of a higher level rule is a particular type of license. Lower level rules under the license include the specific usage rules associated with the license.
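The hierarchical selection behavior (checking a higher level rule selects everything beneath it, while lower level rules can also be checked individually) can be sketched with a nested dictionary. The rule names and the tree shape are made up for illustration.

```python
# Hypothetical rule tree: each key is a rule, each value holds its lower level rules.
RULES = {
    "Example license": {
        "attribution required": {},
        "commercial terms": {
            "no commercial use": {},
            "revenue sharing required": {},
        },
    },
    "permission required": {},
}


def leaf_rules(name, children):
    """Return every lowest-level rule under a rule (the rule itself if it is a leaf)."""
    if not children:
        return {name}
    found = set()
    for child, grandchildren in children.items():
        found |= leaf_rules(child, grandchildren)
    return found


def selected_rules(tree, checked):
    """Expand the checked boxes into the set of lowest-level rules that apply."""
    selected = set()
    for name, children in tree.items():
        if name in checked:
            selected |= leaf_rules(name, children)  # a checked parent selects all below
        else:
            selected |= selected_rules(children, checked)
    return selected
```

The recursion places no limit on depth, in keeping with the statement that usage rules may have any number of levels.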
- Usage rules may be customized for each content owner (and for each content object).
- a unique URL is provided to the content owner for his use (e.g., to include as a link associated with an icon placed in proximity to his content on his website, in his eBay store, etc.)
- the content user can then select the link, which leads to a page describing the content owner's usage rules (for that content object).
- the content owner could use a particular URL on his website or web page.
- the particular URL could be “rules.attributor.com.”
- the content user can select the link, which leads to a page describing the content owner's usage rules (for the website or content on the website).
- the content monitoring system determines from which website the link was selected and can determine which usage rules to display.
- the same URL is common to multiple content owners' websites. Further examples are discussed below.
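One plausible way for a shared rules URL to show site-specific rules is to key the lookup on the referring page. The sketch below assumes the referring URL is available (for example, from an HTTP Referer header); that mechanism, the table `RULES_BY_SITE`, and the host names are assumptions, not details from the source.

```python
from urllib.parse import urlparse

# Hypothetical table mapping a content owner's site to the rules text to display.
RULES_BY_SITE = {
    "photos.example.com": "Attribution required; no commercial use.",
}


def rules_for_referrer(referrer_url, default="No usage rules on file for this site."):
    """Pick the usage rules to display based on the site the link was selected from."""
    host = urlparse(referrer_url).hostname  # None when the URL is empty or has no host
    return RULES_BY_SITE.get(host, default)
```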
- Usage rules may be stored in the content monitoring system.
- the usage rules for content owners may be stored in controlled content store 116 (e.g., as metadata associated with the content object) or in subscriber database 106 .
- controlled content is acquired.
- 204 is performed by ingestor 104 in system 100 .
- controlled content is obtained from a source specified at 202 .
- controlled content is obtained from a particular directory or from one or more servers containing content contributed by a particular user.
- Controlled content acquisition may be automated or non-automated.
- an automated process could poll for updates and acquire controlled content when an update is detected.
- a ping server is used to detect updates.
- controlled content is continuously acquired or ingested. For example, if the controlled content is specified as all content contributed by a particular user on eBay, then when the user contributes new content to eBay, that content is automatically acquired or acquired at configured times or time intervals.
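An automated process that polls for updates could detect changes by comparing content digests between polls, as in this sketch. The use of SHA-256 digests, the `fetch` callable, and the function name are assumptions; the source mentions polling and ping servers without specifying a mechanism.

```python
import hashlib


def detect_updates(fetch, sources, seen_digests):
    """Return the sources whose content changed since the previous poll.

    fetch(source) must return the current content as bytes.
    seen_digests maps source -> last digest and is updated in place.
    """
    changed = []
    for source in sources:
        digest = hashlib.sha256(fetch(source)).hexdigest()
        if seen_digests.get(source) != digest:
            seen_digests[source] = digest
            changed.append(source)
    return changed
```

Sources reported as changed would then be acquired and passed on for analysis.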
- a variety of APIs may be used to acquire controlled content.
- the user is given an opportunity to confirm that it is the correct controlled content or the controlled content the user intended.
- the acquisition of controlled content may involve any network, protocol (e.g., UDP, TCP/IP), firewall, etc.
- controlled content is analyzed.
- 206 is performed by digester 114 in system 100 .
- the acquired content is analyzed for unique identifying features. Any appropriate technique may be used to extract features from the content. For example, a fingerprint associated with the content may be determined. The technique may depend on the media type (e.g., spectral analysis for audio/video, histogram or wavelets for images/video, etc.). For example, in the case of text content, various techniques may be used, such as unique phrase extraction, word histograms, text fingerprinting, etc. An example is described in T. Hoad and J.
- a signature is formed for each clip by selecting a small number of its frames that are most similar to a set of random seed images, as further described in S.-C. Cheung, A. Zakhor, “Efficient Video Similarity Measurement with Video Signature,” Submitted to IEEE Trans. on CSVT, January, 2002.
- an audio fingerprinting technology may be used.
- a spectral signature is obtained and used as input to a hash function.
- Analyzing may include determining spectral data, wavelet, key point identification, or feature extraction associated with the controlled content.
- results from the analysis are stored in controlled content store 116 in system 100 .
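- As one illustration of text fingerprinting, the sketch below hashes overlapping word sequences and keeps the smallest hash values as a compact fingerprint. This is an assumed technique chosen for brevity (a min-wise sketch over word shingles), not necessarily the one used by the described system; the function names and the parameters `k` and `size` are hypothetical.

```python
import hashlib
import re


def shingles(text: str, k: int = 3) -> set[str]:
    """Overlapping k-word sequences, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def fingerprint(text: str, k: int = 3, size: int = 8) -> tuple[int, ...]:
    """Keep the `size` smallest 64-bit shingle hashes as the fingerprint."""
    hashes = {
        int.from_bytes(hashlib.sha1(s.encode()).digest()[:8], "big")
        for s in shingles(text, k)
    }
    return tuple(sorted(hashes)[:size])
```

- Because case and punctuation are normalized before hashing, two copies of a text that differ only in formatting produce the same fingerprint, which could then be stored alongside the content object.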
- monitored content is searched for use of controlled content.
- monitored content is specified by a user, such as a content owner or administrator.
- the entire web may be searched, or a subset of the web (e.g., websites that have been identified as sites that copy the most often or data in a content store such as monitored content store 118 ).
- a database of sites that have been crawled and resulting data may be maintained that is updated at various times. Rather than searching the entire web, the database may be used instead. Searching may comprise a combination of searching the web and consulting a database of previously crawled websites.
- monitored content store 118 in system 100 stores previously crawled websites.
- 208 is performed by crawler 112 in system 100 .
- Searching may be performed in one or more stages, each stage refining the search further.
- a first search may yield a first set of candidate content objects.
- a second search searches the first set of candidate content objects to yield a second set of content objects, and so forth.
- the final set of content object(s) includes the content object(s) that match or most closely match the controlled content object.
- less expensive and/or less complex techniques may be used to obtain candidate sets followed by one or more tighter, smaller granularity techniques to progressively enhance the resolution of the analysis. Which techniques may be used and in which order may be determined based on cost and/or complexity.
- the second search comprises a manual search.
- the second set of content objects may be a smaller set and may be searched by a human.
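- The staged search can be sketched as a cheap first pass followed by a more expensive second pass over the survivors. The two stages shown (shared-word counting, then `difflib.SequenceMatcher` from the Python standard library) are assumptions picked for illustration; the corpus and thresholds are hypothetical.

```python
from difflib import SequenceMatcher


def cheap_filter(controlled: str, corpus: dict[str, str], min_shared: int = 3) -> list[str]:
    """Stage 1: keep documents sharing at least `min_shared` words with the controlled text."""
    target = set(controlled.lower().split())
    return [
        doc_id for doc_id, text in corpus.items()
        if len(target & set(text.lower().split())) >= min_shared
    ]


def refine(controlled: str, corpus: dict[str, str], candidates: list[str],
           threshold: float = 0.8) -> list[str]:
    """Stage 2: run a costlier character-level comparison on the candidates only."""
    return [
        doc_id for doc_id in candidates
        if SequenceMatcher(None, controlled, corpus[doc_id]).ratio() >= threshold
    ]


controlled = "the red fox ran over the frozen lake at dawn"
corpus = {
    "copy": "the red fox ran over the frozen lake at dawn",
    "loose": "fox lake dawn",
    "unrelated": "completely different words here",
}
first_set = cheap_filter(controlled, corpus)
final_set = refine(controlled, corpus, first_set)
```

- The expensive comparison runs only on `first_set`, so its cost scales with the number of candidates rather than with the size of the monitored content.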
- a hash structure is used to obtain a candidate set of content objects.
- a hash table is maintained such that similar content objects are hashed to the same or a nearby location in a hash table.
- to search for content object A, a hash value for A is computed using the associated hash function and looked up in a hash table, and a set of objects that are similar to A is obtained.
- a hash function associated with a content object may be computed in various ways. The hash function may be computed differently depending on the type of content object or one or more characteristics of the content object. For example, if the content object is a text document, a fingerprinting technique specific to text may be used to obtain a fingerprint of the document.
- the fingerprint may be input to the hash function to obtain a hash value that corresponds to a group of other content objects that have a similar fingerprint.
- Hash values that are nearby in the hash table correspond to content objects that have similar (though less similar than those in the same hash bin) fingerprints, to create a clustering effect. In this way, a candidate set of content objects may be obtained.
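- One way to obtain this clustering effect is locality-sensitive hashing. The sketch below uses MinHash with several independent hash tables, which is a swapped-in technique assumed here for illustration: similar documents are likely to share a bucket in at least one table. The class `CandidateIndex` and its parameters are hypothetical.

```python
import hashlib
from collections import defaultdict


def _hash(seed: int, token: str) -> int:
    """Seeded 64-bit hash of a token."""
    return int.from_bytes(hashlib.sha1(f"{seed}:{token}".encode()).digest()[:8], "big")


def bucket_keys(text: str, tables: int = 4, k: int = 3) -> list[tuple[int, int]]:
    """One MinHash value per table, computed over k-word shingles of the text."""
    words = text.lower().split()
    grams = {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}
    return [(seed, min(_hash(seed, g) for g in grams)) for seed in range(tables)]


class CandidateIndex:
    """Hash structure in which similar documents tend to land in a shared bucket."""

    def __init__(self) -> None:
        self.table: dict[tuple[int, int], set[str]] = defaultdict(set)

    def add(self, doc_id: str, text: str) -> None:
        for key in bucket_keys(text):
            self.table[key].add(doc_id)

    def candidates(self, text: str) -> set[str]:
        found: set[str] = set()
        for key in bucket_keys(text):
            found |= self.table.get(key, set())
        return found
```

- A lookup returns every indexed object that shares at least one bucket with the query, which yields a candidate set without comparing the query against every stored object.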
- existing search engines or search facilities on websites are used to obtain a candidate set of documents.
- This approach may be useful in an initial implementation of the system.
- APIs provided by Google or other search engines may be used to perform this search.
- To search for a document, a unique phrase within the document is selected.
- the unique phrase is input to a Google search using a Google API and the results are a candidate set of documents.
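- Selecting a unique phrase could be done in many ways. The sketch below assumes one simple heuristic: pick the word n-gram of the document that occurs least often in a background collection. The function names and the background collection are hypothetical, and no search engine API is called; the returned phrase would be submitted as a quoted query to whichever search facility is used.

```python
from collections import Counter
from typing import Optional


def ngrams(text: str, n: int) -> list[str]:
    """Word n-grams of the text, in document order."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def pick_unique_phrase(document: str, background: list[str], n: int = 4) -> Optional[str]:
    """Return the first n-gram of `document` that is rarest in `background`."""
    counts = Counter(g for doc in background for g in set(ngrams(doc, n)))
    grams = ngrams(document, n)
    if not grams:
        return None
    return min(grams, key=lambda g: counts[g])


background = [
    "terms and conditions apply to the offer",
    "conditions apply to the sale",
]
document = "terms and conditions apply to the glowing violet otter sanctuary"
phrase = pick_unique_phrase(document, background)
```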
- Multimedia search engines (e.g., video, image) may also be used.
- an image search engine may be used to obtain a candidate set of images.
- Riya (www.Riya.com) includes an image search engine that may be used to obtain a candidate set.
- databases may be searched using these techniques.
- Some examples of databases include Factiva, Corbis, and Hoover's. Although these databases do not allow indexing of their documents, they do have a search interface. This search interface may be used to perform searches for content using unique phrase extraction. For example, articles in the Factiva database containing a unique phrase from a controlled content object are more likely to be a match. A subsequent search may be performed by obtaining the full text of the articles and searching them using more refined techniques. Searching this way limits having to crawl the entire Internet. Also the more computationally intensive search techniques are limited to a smaller search space.
- one or more refining searches are performed.
- the candidate set of documents is crawled and advanced matching techniques can be applied to it.
- advanced matching techniques may be used.
- the techniques described at 206 may be used on the candidate set of content objects.
- a refining search may comprise computing a signature for each paragraph or other data set.
- a Levenshtein distance could be used to determine the similarity between a document and the controlled content object.
- a byte by byte comparison could be used.
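- For reference, the edit distance mentioned above (the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another) can be computed with the standard dynamic-programming recurrence. The `similarity` helper, which rescales the distance to the range 0 to 1, is an assumed convenience and not part of the source.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b using a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """1.0 for identical strings, approaching 0.0 as the strings diverge."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```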
- Other techniques, such as anchoring or cosine similarity, may be used, as described more fully in T. Hoad and J. Zobel, “Methods for identifying versioned and plagiarized documents,” in Journal of the American Society for Information Science and Technology, Volume 54, Issue 3, 2003. Techniques such as PCA-SIFT or feature extraction of color, texture and signature generation may be used.
- A. C. Popescu and H. Farid, “Exposing Digital Forgeries by Detecting Duplicated Image Regions,” Technical Report TR2004-515, Dartmouth College, Computer Science, describes examples of such techniques.
- images may be subsampled to be robust against cropping and subimage reuse using techniques such as key pointing (or key point extraction), which looks for unique signatures within a portion of an image, such as edges or extreme color gradations, and samples these portions to obtain a signature.
- Another way is to subsample distinctive portions of a color histogram of the image.
- different techniques are used depending on characteristics of the content object. For example, if a document has fewer than 20 paragraphs, a byte by byte comparison may be used. If a document has 20 or more paragraphs, a different technique may be used. Sampling and anchoring points may depend on the format of the document.
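- The size-based choice of technique could look like the sketch below. The 20-paragraph threshold comes from the example above; the alternative technique for longer documents (comparing per-paragraph signatures) and all function names are assumptions made for illustration.

```python
import hashlib


def paragraphs(text: str) -> list[str]:
    """Split on blank lines and drop empty paragraphs."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def paragraph_signatures(text: str) -> set[str]:
    """One hash-based signature per paragraph."""
    return {hashlib.sha1(p.encode()).hexdigest() for p in paragraphs(text)}


def compare(controlled: str, candidate: str, paragraph_limit: int = 20) -> tuple[str, float]:
    """Pick a technique from the document size and return (technique, score)."""
    if len(paragraphs(controlled)) < paragraph_limit:
        same = controlled.encode() == candidate.encode()
        return ("byte-by-byte", 1.0 if same else 0.0)
    a = paragraph_signatures(controlled)
    b = paragraph_signatures(candidate)
    return ("paragraph-signature", len(a & b) / len(a))
```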
- at 210, use of controlled content is detected.
- 210 - 213 are performed by matching engine 110 in system 100 .
- detection is based on various criteria associated with technical factors that may result from searching at 208 .
- An example of a technical factor is a similarity score.
- a similarity score is a measure of the similarity between two content objects and may be computed in a variety of ways. For example, the Levenshtein distance can serve as a similarity score (a smaller distance indicates greater similarity).
- use of controlled content is detected. The criteria may be configurable by the user or administrator.
- One or more similarity scores may be computed for a controlled object and candidate object to represent various characteristics of the content.
- one or more similarity scores may be weighted and combined into a single similarity score.
- a similarity score may account for various degrees of copying. For example, the first and last paragraph of a document may be copied, a portion of a document may be copied, or the whole document may be copied. Different samples of music may be copied into a single audio file. Videos may be mixed from copied videos. One controlled document may have 15 samples, one or more of which may be copied. A similarity score may account for these factors. For example, a copying extent score may be used to indicate the percentage of a controlled content object that has been copied. A copying density score may be used to indicate the percentage of a match that is comprised of a controlled content object.
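- The copying extent and copying density scores, and their combination into a single weighted score, can be sketched as follows. Measuring overlap with word 3-grams and combining scores by weighted average are assumptions made for this example; the source does not prescribe either.

```python
def word_ngrams(text: str, n: int = 3) -> set[str]:
    """Set of word n-grams used as the unit of overlap."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def copying_extent(controlled: str, candidate: str) -> float:
    """Percentage of the controlled object that appears in the candidate."""
    own = word_ngrams(controlled)
    if not own:
        return 0.0
    return 100.0 * len(own & word_ngrams(candidate)) / len(own)


def copying_density(controlled: str, candidate: str) -> float:
    """Percentage of the candidate that is made up of the controlled object."""
    theirs = word_ngrams(candidate)
    if not theirs:
        return 0.0
    return 100.0 * len(theirs & word_ngrams(controlled)) / len(theirs)


def combine(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of several similarity scores."""
    total_weight = sum(weights[name] for name in scores)
    return sum(weights[name] * value for name, value in scores.items()) / total_weight
```

- For a controlled text "a b c d e f" and a candidate "a b c d x y z", two of the four controlled 3-grams are reused (extent 50) and two of the five candidate 3-grams are copied (density 40).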
- a context associated with the use of the controlled content is evaluated.
- the context refers to any attribute associated with the use of the content object.
- the context includes compliance factors, technical factors, and reputation information.
- Context may be automatically and/or manually determined.
- Compliance factors are based on usage rules specified by content owners. For example, compliance factors include information related to attribution and commercial context. Examples of compliance factors include whether the site is government, education, commercial, revenue producing, subscription based, advertising supported, or produces revenue in some other way (e.g., using a reputation bartering scheme associated with a compensation mechanism). This can be determined manually or automatically. For example, a human could review the website, or it can be determined automatically whether the website is commercial based on the top level domain (e.g., .edu, .com, .org) or the presence of advertising-related HTML code.
- a non-compliance score is computed to represent the likelihood that a content object is non-compliant based on the compliance factors.
- multiple compliance factors are used to determine a non-compliance score.
- the non-compliance score takes multiple compliance factors, normalizes and weighs each one as appropriate, and takes the sum.
- the weighting is based on usage rules and/or host policy rules.
- an overall weight may be used to scale the non-compliance score. For example, content found on educational sites may be weighted differently.
- One or more non-compliance scores may be computed.
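- The normalize, weigh, and sum computation can be written out as a short sketch. The factor names, value ranges, weights, and the overall weight of 0.5 for educational sites are all hypothetical values chosen for the example.

```python
def non_compliance_score(
    factors: dict[str, float],
    ranges: dict[str, tuple[float, float]],
    weights: dict[str, float],
    overall_weight: float = 1.0,
) -> float:
    """Normalize each factor to [0, 1], weigh it, sum, then scale the total."""
    total = 0.0
    for name, value in factors.items():
        low, high = ranges[name]
        normalized = (value - low) / (high - low)
        total += weights[name] * normalized
    return overall_weight * total


# Hypothetical factors for one page: 5 ad units out of a maximum of 10, commercial site.
factors = {"ad_count": 5.0, "is_commercial": 1.0}
ranges = {"ad_count": (0.0, 10.0), "is_commercial": (0.0, 1.0)}
weights = {"ad_count": 40.0, "is_commercial": 60.0}

score = non_compliance_score(factors, ranges, weights)
edu_score = non_compliance_score(factors, ranges, weights, overall_weight=0.5)
```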
- examples of other factors include reputation information.
- a reputation database is maintained that includes reputation ratings of content users by other content owners.
- Bob's blog may have a low reputation because it has posted numerous copyrighted content objects owned by others who have given Bob's blog a low reputation rating.
- matching content (i.e., match content object(s)) is identified.
- a match, copy, or use of controlled content does not necessarily refer to an identical match, an identical copy, or use of identical content.
- a match is a technical match and is selected based only on technical factors, such as similarity scores.
- technical matches are identified at 210 , and at 212 , the technical matches are evaluated based on context to determine whether they are compliant.
- a match is selected based on configurable criteria associated with technical factors (e.g., similarity scores), compliance factors (e.g., non-compliance scores), and/or other factors (e.g., reputation information).
- it is determined that content objects with one or more similarity scores that exceed a similarity score threshold and one or more non-compliance scores that exceed a non-compliance score threshold are matches. In other words, a content object that is technically similar, but is compliant with applicable usage rules, would not be considered a match.
- it is determined that any content object with one or more similarity scores that exceed a similarity score threshold is a match.
- a binary flagging is used. For example, it is determined that content objects with one or more similarity scores that exceed a similarity score threshold and/or one or more non-compliance scores that exceed a non-compliance score threshold are “interesting” and other content objects are “non-interesting.” In some embodiments, “interesting” content objects are reported to the user at 214 .
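- The alternative match criteria above differ only in how the two thresholds are combined, as the following sketch shows. The function names, the `require_non_compliance` switch, and the threshold values used in the example are hypothetical.

```python
def is_match(
    similarity_scores: list[float],
    non_compliance_scores: list[float],
    similarity_threshold: float,
    non_compliance_threshold: float,
    require_non_compliance: bool = True,
) -> bool:
    """Match if similar enough and, optionally, also non-compliant enough."""
    similar = any(s > similarity_threshold for s in similarity_scores)
    if not require_non_compliance:
        return similar
    non_compliant = any(n > non_compliance_threshold for n in non_compliance_scores)
    return similar and non_compliant


def flag(
    similarity_scores: list[float],
    non_compliance_scores: list[float],
    similarity_threshold: float,
    non_compliance_threshold: float,
) -> str:
    """Binary flagging: either threshold being exceeded makes an object interesting."""
    similar = any(s > similarity_threshold for s in similarity_scores)
    non_compliant = any(n > non_compliance_threshold for n in non_compliance_scores)
    return "interesting" if similar or non_compliant else "non-interesting"
```

- With the default setting, a technically similar object whose use complies with the applicable usage rules is not treated as a match, as described above.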
- content is reported to the user (e.g., content owner).
- which content to report is configurable and may depend on criteria based on technical factors (e.g., similarity scores), compliance factors (e.g., non-compliance scores), and/or other factors (e.g., reputation information).
- matching content as identified at 213 is reported to the user.
- a user views and manually confirms whether each matching content object is non-compliant. The results may be stored in a common database.
- 214 is performed by reporter 110 in system 100 .
- Various interfaces could be used. Screenshots, links, buttons, tabs, etc. may be organized in any appropriate fashion.
- a user interface is presented to the user that shows the matching content, one or more similarity scores, and one or more non-compliance scores. Example interfaces for reporting results are more fully described below.
- the interface provides a way for the user to confirm that content is the user's content or reject the content (i.e., indicate a false positive). This data may be fed back into the monitoring process. For example, this information may be stored in a database or with the content metadata.
- the interface provides choices of actions for the user to select from (e.g., ask that the reusing party attributes it, offer license/licensing terms, remove under DMCA, etc.).
- 214 is not performed and the process continues at 216 .
- user contact information is obtained from the IP address, the U.S. Copyright Office (e.g., a designated agent registered with the U.S. Copyright Office), or a known email address (e.g., of an OSP or a user of an OSP).
- a database or lookup table of contact information associated with various sites may be maintained and used to determine user contact information.
- various types of communication may be sent to the content user.
- a DMCA notice, information concerning usage rules, licensing information, etc. may be sent.
- the content owner may have specified one or more usage rules associated with his content, such as “do not license any content,” “replace content with an advertisement,” “add watermark to content,” “add Unicode overlay,” “share advertisement revenue,” or “ask permission prior to use.” Based on the usage rules, an appropriate communication may be sent to the content user.
- the content user is also configured to use the content monitoring system.
- the content user may have specified a set of compliance rules, such as “automatically debit my account up to $100 per year when licensed content is used,” “offer to share advertising revenue when contacted by content owner,” “remove content when contacted by content owner,” etc. Based on the compliance rules, an appropriate response may be sent back to the content owner.
- an engagement communication may be configured to be sent in a way that preserves the anonymity of the sender of the engagement communication (e.g., the content owner, or a content host, as more fully described below).
- An example of an engagement communication includes an email that is automatically sent to a content user notifying the user that the content is owned and offering to license it for $9.99 per year, and including a link to the content owner's usage rules hosted by the content monitoring system.
- the content owner may configure his settings so that the email is not sent to content users whose sites are educational or non-profit, or those settings may be default settings if the content owner's usage rules indicate free use by educational or non-profit sites.
- the content user sends a response agreeing to the terms.
- the response may be created and/or sent automatically because the content user's compliance rules indicate the following rule: “automatically debit my account up to $100 per year when licensed content is used.”
- the response may be sent manually, or the user may approve an automatically created response before it is sent.
- a series of communications may occur between the content user and content owner.
- the responses may be automatic. In this way, licensing terms can be negotiated and/or steps can be taken towards resolution.
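- An automatic response driven by the content user's compliance rule could work as in the sketch below. The `Account` class and the response strings are hypothetical; the $100 yearly limit and the $9.99 license fee are the figures from the examples above.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Content user's account with a yearly automatic-debit limit."""
    yearly_limit: float
    spent_this_year: float = 0.0


def respond_to_offer(account: Account, yearly_fee: float) -> str:
    """Accept automatically while the yearly limit allows it, else defer to a human."""
    if account.spent_this_year + yearly_fee <= account.yearly_limit:
        account.spent_this_year += yearly_fee
        return "accept"
    return "needs manual review"


account = Account(yearly_limit=100.0)
first_reply = respond_to_offer(account, 9.99)   # within the limit
second_reply = respond_to_offer(account, 95.0)  # would exceed the limit
```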
- compensation is not necessarily monetary.
- the content owner may just want to receive attribution; license revenue or advertising revenue sharing may be donated to charitable or other causes as directed by the content owner, or may be treated as a credit towards a trade (e.g., if you use my content, I can use your content); or the content owner may require that the content and derivative works be presented in a manner that enables tracking of the number of uses or views of the content, or that derivative works be available for use by others under specified usage rules.
- processes 202 - 206 are performed.
- every prespecified search interval, processes 208 - 213 are performed.
- every prespecified report interval, 214 is performed. For example, an email may be sent to the user indicating that new matches have been found, and a link to the web interface provided in the email message.
- 214 is performed each time a user logs into the content monitoring system.
- 208 - 213 are performed when a user logs into the content monitoring system, either automatically, or after a user selects an “update results” or “search” button upon logging in.
- the number of accesses to a controlled content object is tracked.
- the content is associated with a web beacon or other element of code that enables the tracking of accesses of the content for purposes such as calculation of license fees or revenue sharing.
- FIG. 2B is a flow chart illustrating an embodiment of a process for monitoring for use of controlled content.
- this process is performed when a content host, such as an OSP, is searching or monitoring for non-compliant use of content based on a host policy of the content host.
- the controlled content in this case is non-compliant content based on a host policy.
- this process is performed by content monitoring system 100 .
- a host policy is specified.
- an OSP may have a policy regarding what comprises non-compliant content.
- Non-compliant content may include material that violates third party copyrights or trademarks, is illegal (e.g., child pornography) or does not comply with an OSP's terms of service (e.g., adult content, pornography, obscenity).
- a host policy may include host rules that may be associated with any compliance structure, such as host specific compliance rules, rules against objectionable content (e.g., obscenity, adult content, child pornography), or any legal or personal compliance structure.
- a host policy may specify that content must comply with usage rules specified by the content owner, such as “copy left,” “share alike,” Creative Commons specified structures, etc.
- a variety of user interfaces may be used to specify a host policy.
- any of the user interfaces described at 203 for specifying usage rules may be used to specify a host policy.
- a list of terms, checkboxes (to apply a rule), and settings (specific to a rule) may be provided.
- the list may include, for example: whether pornography is allowed, whether profanity is allowed, whether to comply with one or more usage rules, whether to comply with copyright or other legal structures, etc.
- the rules may be hierarchical. For example, a list of higher level rules or compliance structures may be displayed for selection, each of which may be expanded to display lower level rules that each of the high level rules comprises. Rules may have any number of levels. Checkboxes (or another appropriate object) may be located next to the higher level or lower level rules and may be selected (e.g., checked off) at any level of granularity.
- the monitored content comprises the content hosted by the content host (e.g., the content served by the OSP).
- monitoring comprises checking each content object before it is hosted (or served) by the OSP.
- an OSP such as youtube.com may check each video before it is made available for viewing on youtube.com.
- monitoring comprises periodically checking content objects served by the OSP. For example, a new video is made available for viewing immediately after being posted, but the video may later be removed by a monitoring process that checks new content objects. If the video is determined to be non-compliant, it is removed and the video owner is optionally notified. The results of the check are stored in a database so that the video does not need to be checked again unless it is modified.
- an evaluation is performed, where the evaluation can include techniques described at 212 .
- the evaluation may also include techniques used to detect objects or characteristics of objects in an image, such as faces, body parts, the age of a person being depicted, etc. Such techniques may be useful to detect pornography or child pornography, for example.
- the evaluation results may then be stored in the database.
- Examples of monitoring are more fully described below with respect to FIG. 2D .
- a common pool of objectionable content is maintained based on input from multiple content hosts.
- the common pool may include content that has been identified by various content hosts as containing pornography, child pornography, profanity, or racial content.
- an OSP may have an interest in contributing to, sharing, and using the common pool to identify objectionable content and remove or reject it.
- an OSP such as eBay may desire to monitor content posted by its users.
- An eBay employee manually performs simple filtering for adult content.
- Content in the objectionable database may also be stored with a certainty rating. For example, the greater number of times the content object has been identified as violating a rule, the greater the certainty rating.
- data is maintained regarding each usage/compliance rule that a content object violates.
- content object 10034 may be non-compliant with rules 4, 7, and 112, but not other rules. This information may be stored in a table, metadata associated with content object 10034, or in any other appropriate way.
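- A table of rule violations with a certainty rating could be kept as in this sketch. The `ObjectionablePool` class is hypothetical, and using the raw report count as the certainty rating is an assumption; the source only says that more identifications yield a greater rating.

```python
from collections import defaultdict


class ObjectionablePool:
    """Per-object record of which rules were reported as violated, and how often."""

    def __init__(self) -> None:
        # content_id -> rule_id -> number of reports
        self.reports: dict[int, dict[int, int]] = defaultdict(lambda: defaultdict(int))

    def report(self, content_id: int, rule_id: int) -> None:
        self.reports[content_id][rule_id] += 1

    def violated_rules(self, content_id: int) -> list[int]:
        return sorted(self.reports.get(content_id, {}))

    def certainty(self, content_id: int) -> int:
        """Assumed rating: total number of violation reports for the object."""
        return sum(self.reports.get(content_id, {}).values())


pool = ObjectionablePool()
for rule in (4, 7, 112, 7):  # rule 7 is reported by two different hosts
    pool.report(10034, rule)
```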
- data from that process may be re-used at 232 .
- similarity, compliance, and other factors may be determined based on data already obtained at 202 - 213 . Additional compliance factors that take into account the host policy may also be determined and used.
- content is reported.
- content to report is configurable and may depend on criteria based on technical factors (e.g., similarity scores), compliance factors (e.g., non-compliance scores), and/or other factors (e.g., reputation information) as described at 214 .
- Content reported may include content determined to be non-compliant based on the host policy. Content reported may also include notices received from content owners who believe the content host is using their content in a non-compliant way.
- a web interface may be provided for viewing and managing reported content.
- the web interface allows the host to track and manage past and/or pending engagement notices.
- the web interface includes information about matching content, reputation information, similarity scores, non-compliance scores, link(s) to usage rules associated with the content object, and any other appropriate information.
- Reputation information could be related to the content owner, e.g., how reputable the content owner is. For example, the content owner may not actually be the content owner, but a scam artist or spammer who has sent thousands of notices. On the other hand, a reputable content owner may have only sent 3 notices in the past year.
- reputation is based on ratings by other content users, content hosts, and/or other users of the content monitoring system. For example, content users who have dealt with a particular content owner and felt that he was legitimate may have given him good reputation ratings.
- APIs to the content monitoring system are provided to the OSP for managing notices and responding.
- an automatic response is sent according to rules set by the OSP. For example, whenever the OSP receives a DMCA notice from a content owner with a reputation rating above a specified value, it automatically takes down the image. In another example, whenever a child pornography match is made with a similarity score above 90 and a non-compliance score above 80, an email is sent to the user and if no response is received within a set period of time, the content is removed. In some embodiments, an OSP administrator manually reviews each content match and selects a response for each content match.
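- The OSP's automatic response rules can be expressed as a small decision function. The thresholds 90 and 80 come from the example above; the event fields, the reputation cutoff `REPUTATION_MIN`, and the returned action names are hypothetical.

```python
REPUTATION_MIN = 4.0  # hypothetical cutoff on an assumed rating scale


def choose_response(event: dict) -> str:
    """Map an incoming notice or match to an action, per rules set by the OSP."""
    if event["kind"] == "dmca_notice" and event["owner_reputation"] > REPUTATION_MIN:
        return "take_down"
    if (
        event["kind"] == "match"
        and event["similarity"] > 90
        and event["non_compliance"] > 80
    ):
        return "email_user_then_remove_if_no_reply"
    return "manual_review"  # fall back to an OSP administrator
```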
- common pools of data include reputation of content owners, reputation of content users, reputation of content hosts, content known to be in the public domain, sites known to copy the most often, etc. These common pools may be contributed to by content owners (e.g., end users), content hosts (e.g., an employee on an OSP's content review team), legal experts, experts in “fair use,” other reputable entities, results from previous detection results (e.g., false positives), etc. APIs or other interfaces may be provided to assist with flagging content for inclusion in these pools. These common pools of data may then be accessed and used during the monitoring process (e.g., during 202 - 216 or 231 - 232 ).
- a negation database may be maintained that includes content that is known to be in the public domain, content that has expired or lapsed in copyright, and/or content that is difficult to claim ownership of, e.g., because it is common, such as disclaimers and copyright notices. Any content in the negation database is designated as compliant.
- FIG. 2C is a flow chart illustrating an embodiment of a process for evaluating context of a content object.
- this process is used to perform 212 when the context includes compliance information (e.g., compliance factors). Examples of compliance factors include the presence or absence of advertising on a page containing the content object, whether the page contains paid content, etc.
- this process is performed by content monitoring system 100 . In some embodiments, this process is performed when a content owner is monitoring for use of his content.
- a detected content object associated with use of controlled content is obtained.
- the detected content object is detected based on technical factors, as described at 210 .
- usage rules associated with the controlled content are obtained.
- the usage rules specified by the content owner at 203 are obtained.
- a usage rule is evaluated against the detected content object.
- the usage rule may be specified at a high level (e.g., do not permit use on for-profit sites, permit use on nonprofit sites) or at a lower level (e.g., do not permit use on pages containing advertising, offer to license on pages containing paid content, permit use on sites ending with .edu). For example, it is determined whether the page associated with the content object contains advertising, requires a subscription, contains affiliate links, or contains paid content.
- it is determined whether the usage rule is satisfied. If not, one or more scores are adjusted. For example, a non-compliance score may be increased or decreased as appropriate.
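- The rule-by-rule evaluation can be sketched as a loop that raises a non-compliance score for each unsatisfied rule. The page fields, rule definitions, and penalty values are hypothetical, and treating .edu sites as always permitted is one assumed reading of the "permit use on sites ending with .edu" example.

```python
from urllib.parse import urlparse


def is_edu(page: dict) -> bool:
    """True when the page's host name ends with .edu."""
    host = urlparse(page["url"]).hostname or ""
    return host.endswith(".edu")


# Hypothetical lower-level usage rules with penalties for violating them.
RULES = [
    {"name": "no advertising", "satisfied": lambda p: not p["has_advertising"], "penalty": 30.0},
    {"name": "no paid content", "satisfied": lambda p: not p["has_paid_content"], "penalty": 50.0},
]


def evaluate_usage_rules(page: dict, rules: list[dict], score: float = 0.0) -> float:
    """Increase the non-compliance score for every rule the page fails."""
    if is_edu(page):  # assumed rule: use on .edu sites is permitted
        return score
    for rule in rules:
        if not rule["satisfied"](page):
            score += rule["penalty"]
    return score
```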
- FIG. 2D is a flow chart illustrating an embodiment of a process for monitoring for use of controlled content.
- this process is performed by content monitoring system 100 .
- this process is performed when a content host, such as an OSP, is checking for non-compliant use of controlled content.
- this process may be used to perform 232 .
- a content object is received. For example, a user is posting a new content object to an OSP site, and the OSP is receiving the content object for the first time.
- a fingerprint of the content object is generated. A fingerprint may be generated, feature(s) may be extracted, or other analysis performed, as described at 206 .
- the fingerprint (or another analysis result) is checked against a database of known non-compliant (or known compliant) content objects.
- the database includes a common pool of content that has previously been identified either manually or automatically as non-compliant or compliant. The content can be looked up by fingerprint or any other appropriate index.
- if the content object is non-compliant according to the database, it is removed at 272 . If it is not non-compliant according to the database, then the content object is evaluated at 268 . (In some embodiments, if the content is compliant according to the database, then the content object is approved for posting.) Evaluating may include any of the processes described at 212 - 213 and/or at 240 - 256 . In some embodiments, evaluating includes notifying the content host (e.g., the OSP) and receiving an evaluation of the content object from the content host. For example, the content host may perform a manual or automatic evaluation. The results of or data from the evaluation are stored in the database with the fingerprint.
- it is determined whether the content object is non-compliant according to the evaluation. For example, the determination can be made based on technical factors, compliance factors, or other factors, as previously described. If the content object is non-compliant, the content object is removed at 272 . If not, the process ends. In some embodiments, if the content object is not non-compliant, then the content object is approved for posting.
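- The overall flow of FIG. 2D (fingerprint, database lookup, evaluation, removal or approval) can be summarized in a sketch. An exact SHA-256 digest stands in for the fingerprint here, which is a simplifying assumption because a real fingerprint would tolerate modified copies; the function names, status strings, and the `evaluate` callback are hypothetical.

```python
import hashlib
from typing import Callable


def fingerprint(data: bytes) -> str:
    """Stand-in fingerprint: an exact digest of the content."""
    return hashlib.sha256(data).hexdigest()


def handle_upload(
    data: bytes,
    known: dict[str, str],
    evaluate: Callable[[bytes], bool],
) -> str:
    """Check the database first; evaluate only content that is not yet known."""
    fp = fingerprint(data)
    status = known.get(fp)
    if status == "non-compliant":
        return "removed"
    if status == "compliant":
        return "approved"
    verdict = "non-compliant" if evaluate(data) else "compliant"
    known[fp] = verdict  # store the result so the object need not be re-evaluated
    return "removed" if verdict == "non-compliant" else "approved"
```

- Here `evaluate` returns True when the content is found to be non-compliant; it could wrap an automatic check or a request for manual review by the content host.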
- FIG. 2E is a flow chart illustrating an embodiment of a process for engaging with a user of non-compliant content.
- this process is performed by content monitoring system 100 .
- this process is used to perform 236 when non-compliant content is found. For example, this process may be performed in place of 272 .
- a content object is determined to be non-compliant. For example, the determination can be made based on technical factors, compliance factors, or other factors, as previously described.
- it is determined whether user contact is requested, which may be a configurable setting.
- the user refers to the entity that posted the content on the OSP. If user contact is not requested, then the content object is removed. If user contact is requested, then the user is contacted at 284 . For example, the user is notified that the user's content has been identified as non-compliant content and to either take down the content, explain why the content is compliant, or cause the content to be compliant (e.g., based on usage rules for the content).
- the process ends. In some embodiments, if the content object is now in compliance, a database is updated to include this information.
- FIG. 2F is a flow chart illustrating an embodiment of a process for displaying compliance information (e.g., rules) to a content user wishing to use content on a content owner's website (as described at 203 ).
- a content owner has created a web page of his content (e.g., “www.example.com”) and included on the web page a link that is associated with a server that stores compliance information associated with his content.
- the link is a common URL, where the common URL is not unique to the content owner or his web page (e.g., “rules.attributor.com”).
- the web page is viewed, e.g., by a potential content user.
- the “rules.attributor.com” link is selected. For example, the content user is interested in using the content, and would like to know if there are any usage rules associated with it.
- a receiving system receives the request for “rules.attributor.com” at 296 and determines the appropriate compliance information at 298 .
- the compliance information is determined by looking up the web page from which the link was selected (e.g., the content owner's web page) in a table (or other appropriate structure) of compliance information. For example, next to “www.example.com” in the table are usage rules associated with content on “www.example.com.”
- the table includes information about content objects on the web page and associated usage rules.
- the server retrieves the content on web page “www.example.com” and looks up associated compliance information based on the retrieved content information.
- each content object may have a content object ID or fingerprint that may be used to identify it and look up usage rules associated with it.
- both the URL “www.example.com” and information associated with the content object are used to obtain the compliance information.
- a web page with the compliance information is returned.
- the web page with the compliance information is viewed. For example, the potential content user views the compliance information and can decide whether to use the content.
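- As an illustration only, the lookup at 298 can be sketched as a table keyed by the host of the referring page. The table contents, the function name `lookup_compliance`, and the idea that the receiving system learns the referring page from the request (for example, from an HTTP Referer header) are assumptions made for this sketch, not details taken from the description.

```python
from urllib.parse import urlparse

# Hypothetical table: content owner's host -> usage rules for content on that host.
COMPLIANCE_TABLE = {
    "www.example.com": ["attribution required", "no commercial use"],
}


def lookup_compliance(referring_page: str) -> list[str]:
    """Return usage rules for the page from which the common link was selected."""
    # Accept bare hosts ("www.example.com") as well as full URLs.
    if "//" not in referring_page:
        referring_page = "//" + referring_page
    host = urlparse(referring_page).netloc.lower()
    # An unknown page simply has no recorded rules in this sketch.
    return COMPLIANCE_TABLE.get(host, [])
```

A fuller version might also key the table on a content object ID or fingerprint, as the description suggests, so that rules can differ per object on the same page.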
- FIG. 3 is an example of a graphical user interface (GUI) for providing controlled content.
- a user uses GUI 300 to specify content to be monitored at 202 .
- a user can enter a URL or a link to controlled content or upload a file. Any number of content objects can be specified.
- a username and password to access content can be provided.
- a user uses GUI 300 to specify input to ingestor 104 in FIG. 1 .
- GUI 300 and the other GUIs described herein may vary depending on the embodiment. Which functionality to include and how to present the functionality may vary. For example, which objects (e.g., text, links, input boxes, buttons, etc.) to include and where to place the objects may vary depending on the implementation.
- FIG. 4A is an example of a GUI for providing controlled content.
- GUI 400 opens in response to selecting a link in GUI 300 , such as the “Add Content” button.
- a user uses GUI 400 to specify content to be monitored at 202 .
- a user uses GUI 400 to specify input to ingestor 104 in FIG. 1 .
- one or more files may be provided in the “My content” input box.
- a user can indicate whether the content is a single web page or file or a URL or feed. In the case of the URL or feed, the content includes all existing content plus any new content added in the future.
- in the “Nickname” input box, the user can specify a nickname for the controlled content. In this way, a user can manage or maintain multiple sets of controlled content using different nicknames.
- a “Sites to watch” input box is shown, in which the user may enter URLs where the user expects the content to appear. For example, the user may currently be aware that a particular site is using the user's content.
- the content monitoring system searches the web, but searches the specified sites first or more frequently.
- a “Related Keywords” input box is shown, in which the user may enter keywords associated with the specified controlled content. For example, if the user expects the content to be found primarily in children's websites, the keywords “kids” and “children” might be included.
- the content monitoring system automatically determines keywords (such as unique phrases) to search in addition to the related keywords specified by the user.
- a “Search Scope” input box is shown, in which the user may specify whether the entire Internet should be searched or only domains specified by the user. In some embodiments, the user may specify to only search sites that copy the most often.
- a “Text” input box is shown, in which text may be entered.
- the text may be text in the content itself or text associated with the content, such as keywords, tags, depictions of the text (e.g., a photo of a street sign with text), etc.
- search criteria may be specified, including a minimum similarity score, a minimum non-compliance score, a minimum percent of controlled content copied, a minimum percent of text copied, a minimum number of images copied, a minimum percent of match, whether the content is attributed (e.g., to the content owner), whether there is advertising on the page and what type, the minimum number of unique visitors per month, and what types of matches to find (e.g., images only, text only, video only, or combinations, etc.).
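- A minimal sketch of how a few of these search criteria might be applied to candidate matches follows. The `Match` fields, the 0 to 100 score range, and the function name `filter_matches` are hypothetical and chosen only for illustration; only three of the listed criteria are modeled.

```python
from dataclasses import dataclass


@dataclass
class Match:
    url: str
    similarity: int       # assumed 0-100 scale
    non_compliance: int   # assumed 0-100 scale
    attributed: bool      # whether the match attributes the content owner


def filter_matches(matches, min_similarity=0, min_non_compliance=0, attributed=None):
    """Keep matches that satisfy every specified criterion.

    attributed=None means the attribution criterion is not applied.
    """
    kept = []
    for m in matches:
        if m.similarity < min_similarity:
            continue
        if m.non_compliance < min_non_compliance:
            continue
        if attributed is not None and m.attributed != attributed:
            continue
        kept.append(m)
    return kept
```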
- FIG. 4B is an example of a GUI for providing usage rules.
- GUI 402 is included as part of GUI 400 .
- GUI 402 opens in response to selecting a link in GUI 400 , such as a “Specify Rules of Use” link (not shown in GUI 400 ).
- a user uses GUI 402 to specify usage rules associated with the content specified in GUI 400 .
- a user uses GUI 402 to specify usage rules at 203 .
- a list of usage rules may be selected by selecting bullets and checkboxes.
- the rules listed in this example include: attribution required/not required; commercial use OK, OK if user shares a specified percentage of the revenue, or no commercial use; limit text copies to a specified percentage of the source (controlled) content; no changes may be made to controlled content; contact content owner first for permission; share alike; a specified Creative Commons license; all rights reserved; or public domain.
- a similar GUI may be used to specify host rules for a host policy.
- FIG. 5 is an example of a GUI for displaying search results.
- GUI 500 is used to report search results at 214 , e.g., to a content owner.
- reporter 110 in FIG. 1 reports results using GUI 500 .
- a content owner owns a collection of photography related content, including images of cameras and text descriptions of cameras.
- the search results are shown in a grid based layout.
- a controlled content object and a match content object are shown, where it has been determined that the match content object is similar to the controlled content object based on a similarity score and a non-compliance score.
- the controlled image (camera1) and the match image (camera2) have a similarity score of 98 and a non-compliance score of 88.
- data displayed includes one or more of the following: similarity score, non-compliance score, URL of the match content object, percent of the controlled object copied, percent of the controlled text copied, the number of controlled images copied, the date found, whether there is advertising on the page, etc.
- a portion of the copied text is displayed along with the controlled text in grid cell 504 .
- a binary flagging (e.g., “interesting” or not) is reported. For example, a score that aggregates similarity, non-compliance, and/or other factors into a combined/summary score may be displayed.
- the additional matched content objects are displayed using a 3D graphical effect indicating there are multiple pairs. Using forward and back arrows, the user can cycle through the various pairs. In some embodiments, the pairs are displayed in descending similarity score order.
- various other functionality may be provided in GUI 500.
- the search results may be filtered and sorted in various ways using the “Showing” and “Sort by” pull down menus. Additional controlled content may be added in the “Controlled Content” input box, an email address may be entered for automatic notification (e.g., when more matches are found) in the “Email Address” input box, etc. Rather than use a grid based layout, other layouts may be used in other embodiments.
- an interface similar to interface 500 may be used to display resulting matches.
- cell 502 may display a match with copyrighted content.
- Cell 504 may display a match with content associated with child pornography.
- in place of text1 may be a known image that has been positively identified (either manually or automatically) as child pornography, and in place of text2 may be a new image that is being posted by a user to the content host.
- the known image in place of text1 may have been in a database of known non-compliant content, and the match determined as described at 264 .
- the new image is determined to be a match with child pornography based on an evaluation (e.g., 268 ) rather than a match with a content object in a database of known pornography.
- FIG. 6 is an example of a GUI for displaying use of a content object.
- GUI 600 is displayed in response to selecting a “Match” link or the image or text corresponding to a match object in GUI 500 .
- the portions of the web page that include use of the controlled content are marked, i.e., boxed (e.g., a graphical box around the image or text that is being used).
- text1, text3, and photo2 are controlled content objects that are being used on this web page.
- various indicators (e.g., visual cues) may be used to mark the copied portions. Examples of indicators include: highlighting text, changing font appearance (e.g., using bold, underline, different fonts or font sizes, etc.), using different colors, displaying icons or other graphics in the vicinity of the copied portions, using time dependent indicators, such as causing the copied portions to flash, etc.
- an archive date (May 31, 2006) is shown.
- Applicable usage rule(s) specified by the content owner may be displayed.
- the usage rules are displayed using the icons described with respect to FIG. 4B .
- details regarding the associated usage rule may be displayed.
- the web page shown is the current version of the web page.
- the web page shown is an archived version.
- the archived version may be stored in monitored content store 118 . Whether the web page is the current version or an archived version may be indicated in the GUI.
- the user may be able to toggle between the two versions.
- a management GUI may be provided for managing content that provides links to one or more of the GUIs described above.
- a user uses the management GUI to manage content, including add new controlled content, modify search parameters, report search results, etc.
- various tabs may be provided, such as a “My Content” tab used to add/modify controlled content and search parameters and a “Matches” tab used to display search results.
- selecting the “Matches” tab opens GUI 500 .
- a user can group content into categories, such as categories associated with the user's blog, camera reviews, the user's eBay account, and all files.
- content may be grouped in folders, by tags, or in any other appropriate way.
- a list of controlled content (e.g., URLs, paths) associated with the category may be displayed, including the number of content objects associated with the controlled content, when the content objects were last archived (e.g., placed in controlled content store 116 ), rules associated with each content object, and the number of matches found for the content object(s).
- FIG. 7 is a block diagram illustrating an embodiment of a system for making a determination of originality of content.
- System 700 provides a determination of originality for one or more content objects. In some embodiments, system 700 provides a determination of originality for all registered content objects. In some embodiments, system 700 provides a determination of originality for a content object in response to a request. For example, system 700 may be used in a content clearinghouse system.
- a determination of originality may be useful in a variety of applications, including the verification of originality for the purposes of determining the priority of listings in search and match results in search engines, licensing of the content, and participation of publishers in contextual advertising networks, as more fully described below.
- an original content object includes an instance of a content object that is available at a location (such as a URL) that is served by or on behalf of either the original author or creator of the content or by a party authorized to present the content.
- a derivative version of a content object may be non-identical to an original version of a content object.
- content object 704 is provided as input to originality analysis block 702 .
- content object 704 is one of a plurality of content objects that is provided as input to originality analysis block 702 .
- a crawler such as crawler 112 ( FIG. 1 ) may crawl the Internet to catalog which content is original (and which is not).
- users designate content sources for capture and comparison.
- the matching content is provided by a matching engine 706 , such as matching engine 120 ( FIG. 1 ).
- the matching content is content that matches content object 704 based on criteria such as similarity scores and non-compliance scores, as described above. Depending on content object 704 , there may or may not be matching content.
- Originality analysis block 702 analyzes content object 704 , originality factors 712 , and matching content.
- Originality factors 712 include any factors that may affect the originality of the content object. Examples of originality factors 712 include: whether the originality of the content object has been challenged by another user, whether the claimed owner of the content object is registered or authenticated, and any third party verification of an originality factor—such as, where a third party content host presents the content with an indication that the user has a paid subscription to the hosting service, which may indicate that the user is not anonymous and therefore more likely to be the claimed rights holder. Besides originality factors 712 , other originality factors may be analyzed. For example, other originality factors may be derived from content object 704 (or the matching content), such as whether content object 704 is a subset or superset of another content object and the presence or absence of attribution. Originality factors are described more fully below.
- originality analysis block 702 determines an originality score for content object 704 and each matching content object.
- Originality determiner 708 makes a determination of which of the content objects, if any, are original content objects.
- An originality determination 710 is provided as output.
- Originality determination 710 may be made based on a variety of rules and does not necessarily identify the actual original content object. For example, in some cases, originality determiner 708 selects the content object or objects most likely to be original, e.g., based on one or more rules.
- “Deemed Original” status is published in a Usage Rules Summary associated with each registered content page, as a web service, available to third parties such as search engines, in a visible badge that may be coded by users into their registered content pages and feeds, on match results pages when the match query includes the registered content, and/or as part of publisher reputation information that is provided to hosts with remedy requests.
- Deemed Original status provides an originality verification for licensing and revenue sharing transactions involving the registered content.
- Presentation on match results pages allows potential content licensees to find rights holders more easily.
- Originality scores can be used for ranking search engine results, filtering search spam and claiming rights to contextual ad revenue, as described more fully below. Individual authors and creators may take pride in the distinction of having content that is a Deemed Original.
- Deemed Original status for any content object is noted in appropriate parts of their interface.
- a user must subscribe (and become a paying user) to have their Deemed Original Status publicly available.
- upon initial designation of a content source for monitoring, the user is provided with a link to an inventory of all content objects that are Deemed Originals. This may provide an opportunity to communicate immediate benefits of subscription at the time of registration, even prior to identification of any actionable matches.
- a Source Detail Page includes visual cues (such as highlighting) to indicate Deemed Original content objects. This view also is publicly available from a Usage Rules Summary for the related page.
- FIG. 8 is a flowchart illustrating an embodiment of a process for performing an originality determination. This process may be implemented on system 700 , for example.
- a content object is received.
- a request for a determination of originality of a content object is received.
- the request may be made by a content owner, a content host, a content user, or a crawler.
- the request may be made by a user who is interested in using the content object and would like to know who owns the content object.
- a content host may request a determination of originality so that it can provide a visual indication of originality when displaying a content object.
- search engine results may display search results associated with original content objects above other search results.
- an originality determination is made. Determining originality includes analyzing one or more originality factors related to the content object and determining automatically whether the content object is original based on the analysis of the originality factors. Examples of originality factors are more fully described below.
- the determination is output. In some embodiments, the determination is displayed in a user interface. The user interface may display a visual indication of whether the content object is a deemed original. In some embodiments, the originality score is displayed or accessible to the user. In some embodiments, the originality score is hidden from the user.
- the originality determination for a content object has already been made and is stored (e.g., as metadata for the content object, as described below). In this case, the determination is retrieved at 804 .
- FIG. 9 is a flowchart illustrating an embodiment of a process for making an originality determination. This process may be implemented by system 700 and may be used to perform 804 .
- it is determined whether there are any matching content objects. If it is determined that there are no matching content objects, then an originality score is computed for the content object at 902 .
- the content object is designated as a deemed original based on the originality score for the content object. For example, a threshold may be specified such that if the originality score for the content object is above the threshold, the content object is designated as a deemed original.
- an originality score is computed for the content object and the matching content objects.
- the content object corresponding to the highest score (of the original and the matching content objects) is designated as a deemed original.
- the content object corresponding to the earliest time of first appearance is selected.
- the originality score is outputted. As previously described, there may be more than one instance of an original content object. In some embodiments, if two or more content objects have matching or similar originality scores and/or are above a threshold, then these content objects are all Deemed Originals.
- the originality score is stored, for example, as metadata for the content object. In some embodiments, the originality score for all the content objects is stored, which may be desirable so that calculating the scores does not need to be repeated at a future time. In some embodiments, an indication of whether a content object is deemed original is stored. In some embodiments, the content objects that are not deemed original are deemed unoriginal, and this is stored. For example, a content object may be deemed original, deemed unoriginal, or not yet tested for originality.
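- The decision flow of FIG. 9 can be sketched roughly as below. The `Scored` record, the threshold value, the numeric timestamp, and the use of earliest first appearance specifically as a tie-breaker between equal scores are assumptions for illustration; the description leaves these choices to the embodiment.

```python
from dataclasses import dataclass


@dataclass
class Scored:
    name: str
    score: float      # originality score, already computed
    first_seen: int   # hypothetical timestamp of first appearance; smaller is earlier


def deemed_original(subject, matches, threshold=0.5):
    """Return the object designated as deemed original, or None."""
    if not matches:
        # No matching content: designate the subject only if it clears the threshold.
        return subject if subject.score > threshold else None
    candidates = [subject, *matches]
    # Highest score wins; among equal scores, the earliest first appearance wins.
    return min(candidates, key=lambda c: (-c.score, c.first_seen))
```

An embodiment that designates several objects as Deemed Originals when their scores are similar would return a list instead of a single object.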
- FIG. 10 is a flowchart illustrating an embodiment of a process for computing an originality score for a content object.
- this process is used to perform 902 and/or 908 .
- This process may be implemented on system 700 .
- the process begins at 1002 , at which it is determined whether the content object is registered.
- a content object may be registered by the owner of the content object. In this example, a content object must be a deemed original in order to be registered. Thus, if it is determined that the content object is registered at 1002 , then at 1004 , an indication of registration is outputted.
- originality factors related to the host may be analyzed. For example, if a user browsing the web comes across a content object hosted by a content host, the user may request that an originality determination be made of the content object. Originality factors related to the content host may then be analyzed, including, for example, whether the content host is registered or is legally bound, as more fully described below.
- originality factors related to the claimed owner may be analyzed. For example, a content owner may upload a content object and request to register the content object. Before the content object can be registered to the content owner, an originality determination is made of the content object. Originality factors related to the content owner may then be analyzed, including, for example, whether the content owner is registered or is legally bound, as more fully described below.
- the content object may both have a claimed owner and be hosted.
- a professional photographer's photo may be displayed on a news website at URL X.
- the photographer may then request to register the photo located at X.
- an originality determination is made of the photo, including analyzing both the content owner (i.e., the photographer) and the content host (i.e., the news website). If, for example, the news website is registered and legally bound, then the content object would have a higher originality score.
- Other originality factors include, for example, historical information associated with the content object, attribution associated with the content object, and the quality of the content object. Examples of analyzing other originality factors are described below with respect to FIG. 12 .
- an originality score is computed based on the analysis at 1006 and/or 1008 . For example, if the claimed owner is registered, then the originality score is higher. If there is an attribution to the content object, then the originality score is higher. Further examples are provided below.
- the score is outputted.
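- One possible shape for the FIG. 10 process is sketched below. The point values, the baseline, and the function name `score_content_object` are invented for illustration; the description states only the direction in which each factor moves the score, not any magnitudes.

```python
def score_content_object(is_registered, owner_registered=False,
                         host_registered=False, attributed=False):
    """Return ("registered", None) or ("scored", points)."""
    if is_registered:
        # In this example only deemed originals can be registered, so an
        # indication of registration is output instead of a computed score.
        return ("registered", None)

    points = 50  # hypothetical baseline
    if owner_registered:
        points += 20  # a registered claimed owner raises the score
    if host_registered:
        points += 20  # a registered host raises the score
    if attributed:
        points += 10  # attribution to the content object raises the score
    return ("scored", points)
```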
- FIG. 11 is a flowchart illustrating an embodiment of a process for analyzing originality factors related to the host and/or claimed owner of a content object. For example, this process may be used to perform 1006 . This process may be implemented by system 700 .
- the process begins at 1101 , at which the reputation of the host or claimed owner is checked.
- reputation is based on past behavior of a host or owner. For example, a host that always attributes content has a higher reputation than a host that does not attribute content. A content owner whose ownership is frequently challenged has a lower reputation than a content owner whose ownership has never been challenged. Reputation may be with respect to one or multiple systems. In some embodiments, a higher reputation corresponds to a higher originality score. Reputation is more fully described below with respect to FIG. 13 .
- it is determined whether the host or claimed owner is registered.
- content hosts are registered as hosts and content owners are registered as owners.
- hosts and owners are registered as the same entity.
- registration comprises providing identification (e.g., name, email address, residence address, social security number, credit card number, etc.). In some embodiments, the more identification provided, the higher the originality score.
- the host or owner must be registered in order to be authenticated and must be authenticated in order to be legally bound. (Other embodiments may vary.) Therefore, if the host or owner is not registered, then the result that the host or owner is not registered is output at 1108. If the host or owner is registered, then it is determined whether the host or owner is authenticated at 1104. In some embodiments, authentication comprises verification of identification (e.g., verifying that a credit card number is valid). In some embodiments, the user is a paying subscriber, whose identity may be authenticated through a credit card transaction, identity verification service, direct enterprise contract, or other identity verification method.
- the results are output at 1108 . If the host or owner is not authenticated, then the results (from 1102 and 1104 ) are output at 1108 . If the host or owner is authenticated, then it is determined whether the host or owner is legally bound at 1106 . In some embodiments, a host or owner is legally bound if the host or owner has agreed to comply with one or more legal requirements. For example, the user has agreed to specified obligations or penalties for ownership claims made in bad faith. The results (from 1102 , 1104 , and 1106 ) are output at 1108 .
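- The dependency chain in FIG. 11 (registered, then authenticated, then legally bound) can be sketched as a sequence of early exits. The function name `check_party` and the dictionary used to carry the results are hypothetical.

```python
def check_party(registered: bool, authenticated: bool, legally_bound: bool) -> dict:
    """Collect the checks that were reached, stopping at the first failed one."""
    results = {"registered": registered}
    if not registered:
        return results  # authentication is not checked for unregistered parties

    results["authenticated"] = authenticated
    if not authenticated:
        return results  # legal binding is not checked for unauthenticated parties

    results["legally_bound"] = legally_bound
    return results
```

Because later checks are skipped once an earlier one fails, the keys present in the result show how far the process got.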
- FIG. 12 is a flowchart illustrating an embodiment of a process for analyzing originality factors. For example, this process may be used to perform 1008 . This process may be implemented by system 700 .
- the process begins at 1202, at which the extent of duplication is analyzed. For example, if it is determined that there are matching content objects, then the number of matching content objects is used to compute the originality score for the content object. In some embodiments, the greater the number of matching content objects, the lower the originality score. In some embodiments, the number of matching content objects is used in combination with other factors, such as the number of matches that provide attribution versus the number of matches overall; the number of matches that pre-date versus post-date the first known instance of the subject content; and the number of pre-dating matches that contain attribution from other third parties.
- Attribution may be direct or indirect attribution.
- An example of direct attribution is cnn.com attributing a news article to a media outlet, for example, by placing the name of the media outlet and/or a link to the media outlet's web location near the article.
- An example of indirect attribution is a blog attributing the news report to cnn.com.
- the more direct or indirect attribution to a particular content object, the higher the originality score of the content object. Attribution may be determined using information extraction, natural language processing, and/or analysis of links (e.g., URLs).
- similarity to matches is analyzed, where similarity includes whether the content object is a subset of a match, a superset of a match, or identical to a match.
- a content object may comprise one paragraph from a news article (subset) or a content object may comprise the news article plus additional content (superset). Percentages or other measures may be computed to quantify the similarity. In some embodiments, if the content object is a subset of a match, then the originality score is lower. If the content object is a superset of a match, then the originality score is higher.
- a time associated with the content object is determined.
- the time could be a time or time stamp when the content object first appeared on a content host.
- the time is obtained using the Internet Archive.
- the earlier the time, the higher the originality score.
- a time associated with the content, matching content, or attributing content is determined.
- the results are output.
- This flowchart is an example of various originality factors that may be analyzed.
- various other originality factors may be analyzed.
- the quality of the content object may be analyzed.
- An example of quality is the resolution of an image.
- the higher the quality of a content object, the higher the originality score.
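- The subset/superset comparison described for FIG. 12 can be illustrated by treating each content object as a set of parts (for example, paragraphs). This representation, the function names, and the adjustment magnitudes are assumptions; only the signs of the adjustments follow the description.

```python
def classify_similarity(content_parts, match_parts):
    """Classify a content object relative to a match by comparing their parts."""
    content, match = set(content_parts), set(match_parts)
    if content == match:
        return "identical"
    if content < match:   # proper subset, e.g. one paragraph of an article
        return "subset"
    if content > match:   # proper superset, e.g. the article plus extra content
        return "superset"
    return "other"        # partial overlap or no overlap


def similarity_adjustment(relation):
    """Hypothetical score adjustment: subsets lower the score, supersets raise it."""
    return {"subset": -10, "superset": 10}.get(relation, 0)
```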
- FIG. 13 is a block diagram illustrating an example of originality factors related to the reputation of a host. For example, these factors may be analyzed at 1204 .
- web location 1300 is hosted by a content host.
- a web location may include a web page, feed, or any other method of transmission.
- Web location 1300 includes content objects 1-N.
- n1 is the number of attributions to content object 1. Attributions may be in the form of text or links.
- n2 is the number of attributions to content object N.
- n3 is the number of attributions to web location 1300 .
- n4 is the number of attributions from web location 1300 to other web locations, where the attributions are used to attribute content on web location 1300 .
- the greater n1, n2, n3, and/or n4, the greater the reputation of the host. Stated another way, if there are many attributions to web location(s) of a host (n3) or to content objects on web location(s) of a host (n1, n2), then the host's reputation goes up. Similarly, if there are many attributions from web location(s) of the host to other sources (n4), then the host's reputation goes up. In other words, a host that tends to attribute content has a good reputation.
- n1, n2, and n3 may include direct and/or indirect attribution. In some embodiments, direct attribution is weighted differently or more heavily. In some embodiments, since a content owner may be associated with a host, the content owner's reputation is based on the host's reputation and vice versa.
- reputation is based not just on the tendency to attribute content, but also the tendency to attribute content properly or consistently with instances of attribution to the same source provided by other properties. In other words, in some embodiments, improper attribution does not necessarily increase the positive weight of the reputation.
- a host or owner's reputation is based on the number of times or frequency that ownership of content by the host or owner is challenged. In some embodiments, the fewer times the host or owner has been challenged, the higher its reputation.
- a challenge to a claim of ownership of content that has been designated as a “Deemed Original” changes its designation to a lower level of authentication (such as “Qualified Original”).
- a web page or other collection of one or more content objects is given a designation to indicate that each of the content objects on the Web page or in the collection is either: (1) original or (2) non-original and properly attributed.
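- As a rough sketch of the FIG. 13 reputation factors, the attribution counts can be combined into a single number that rises with each count and falls with ownership challenges. The additive formula, the penalty weight, and the function name `host_reputation` are assumptions; the description specifies only the direction of each effect.

```python
def host_reputation(object_attributions, location_attributions,
                    outgoing_attributions, challenges=0, challenge_penalty=5):
    """Combine attribution counts into one hypothetical reputation value.

    object_attributions   -- attributions to each content object (n1, n2, ...)
    location_attributions -- attributions to the web location itself (n3)
    outgoing_attributions -- attributions from the location to other sources (n4)
    challenges            -- number of ownership challenges against the host
    """
    incoming = sum(object_attributions) + location_attributions
    return incoming + outgoing_attributions - challenge_penalty * challenges
```

An embodiment that weights direct attribution more heavily than indirect attribution would split each count in two and apply separate weights.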
- FIG. 14 is a flowchart illustrating an example usage of a system for determining originality, such as system 700 .
- a search engine displays search results based on the originality of the search results.
- a search engine may use an API that returns originality determination results for content objects.
- a search engine may perform the originality determination.
- search results are displayed based on results of the originality determination.
- the search results may be displayed in an order that takes into account the originality of content in the search results. Search results with original content would be displayed higher than search results with less original content (e.g., search spam).
- search results could be sorted and displayed based at least in part on originality scores associated with content in the search results.
- search results with original or unoriginal content could be flagged or some other visual cue used.
- matches of Deemed Originals or content objects with higher originality scores are treated as higher priority in sorting of match results (since they are potentially of more interest to searchers), and/or may be provided as a more advanced search sorting filter.
- content is included or excluded from search results based at least in part on the originality determination.
- content is presented with an indication of originality, which indication reflects a query, which may be real-time, to a third-party system.
- this originality determination accounts for the presence of non-original content which does not lower the originality score so long as the non-original content is properly attributed, for example to a third party source(s).
- Originality determination information may be used in a plurality of other applications.
- an advertising network uses the determination to ensure that advertising revenue or other benefit is paid only to authorized persons, or that advertising revenue or other benefit is paid in proper amounts to one or more authorized persons.
- an advertising system (e.g., Google AdSense) may use originality determination information for screening purposes.
- the advertising system does not provide advertising revenue to websites that do not provide original content. This prevents search spam sites from receiving advertising revenue.
- third party users may challenge a registered user's claim of ownership to a content object and their Deemed Original status.
- any challenger must have registered the related content and provided identity verification. The system does not need to make any judgment in such a situation, but may capture and publish information from each party along with other relevant information (such as time of first appearance).
- a challenger may be required to show good faith by agreeing to specified dispute resolution rules. These rules could include legal fee shifting (loser pays attorneys' fees), liquidated damages (e.g., a fixed minimum amount), and/or arbitration (e.g., rapid, non-appealable resolution of the dispute).
- “Deemed Original” status may be revoked.
- whether and how the challenging user can thereafter acquire “Deemed Original” status is configurable.
- the “Deemed Original” process uses historical search results from an archive, such as the Internet Archive.
- the Internet Archive could be queried and results cached at the time of registered content capture. Where available, challenge participants can be encouraged to provide Archive references in their dispute statements.
- a separate terms-of-service disclosure and acceptance process is required to activate the publication of Deemed Original status.
- the disclosure may highlight the dispute resolution terms, and any uncertainties associated with issues like enforceability and conflict of laws.
- the terms of service provide that multiple challenges to a user's Deemed Original status may result in account termination. Such conditions may trigger human review by a staff member.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- Signal Processing (AREA)
- Computer Hardware Design (AREA)
- Technology Law (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Computing Systems (AREA)
- Computer Networks & Wireless Communication (AREA)
- Information Transfer Between Computers (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
This disclosure relates, e.g., to governing distribution of content on a web-based service. One aspect of the disclosure relates to a system comprising various interfaces, e.g., for: i) receiving content posted to a web-based service, for distribution by the web-based service to the public, ii) presenting for review one or more items of user-posted content hosted by the web-based service that are identified as a match with identified copyrighted content, and receiving, via an interface for use in confirming the match, information regarding the match, and iii) distributing the received content from the web-based service along with an attribution associated with the identified copyrighted content. A great variety of other aspects, claims, features and arrangements are also detailed.
Description
This application is a continuation of application Ser. No. 14/271,297, filed May 6, 2014 (published as US 2014-0259097), which is a continuation of Ser. No. 14/258,633, filed Apr. 22, 2014, which is a continuation of application Ser. No. 11/655,748, filed Jan. 19, 2007 (now U.S. Pat. No. 8,707,459). The Ser. No. 14/271,297 application is also a continuation-in-part of application Ser. No. 11/512,067, filed Aug. 29, 2006 (now U.S. Pat. No. 8,738,749). These patent documents are incorporated herein, in their entireties.
BACKGROUND
Content, such as text, images, and video, may be stored and displayed on the Internet. For example, an online service provider (OSP), such as Google or YouTube, may display images as a result of a text-based image search or video posted by users. There are many cases in which content on the Internet is being used in a non-compliant way. Non-compliant content may include material that violates third party copyrights or trademarks, is illegal (e.g., child pornography), or otherwise does not comply with a content owner's terms of use or with an OSP policy. Examples of potentially non-compliant use of content include bloggers copying text from news reports, eBay sellers copying other sellers' listing content, aggregators republishing listings from other sites, spammers using copyrighted text to create web pages to influence search results and generate advertising revenue, or even innocent/accidental use of non-compliant content by a conscientious consumer.
Content on the Internet is difficult to monitor for compliance. Typically, a content owner manually monitors the Internet for copies of the owner's content through repetitive queries in search engines like Google. In some cases, the use of the owner's content is permissible under their own license terms or under legal principles such as the copyright concept of “fair use,” which considers such factors as whether attribution has been provided, what portion of the content has been used without permission, and whether the content has been used for commercial purposes (such as generating advertising or subscription revenue). Content owners have no automated methods to evaluate the context in which their content is used by others.
Even when non-compliant use of content is detected, typically it is difficult to remedy. In the case of copyright non-compliance, the content owner's objective usually is to cause the content to be removed from third-party services that host the content or search engines which refer users to it through their indices. This typically is a manual process which involves submitting a notice under the Digital Millennium Copyright Act (DMCA). The DMCA provides OSPs and search engines with a safe harbor from copyright infringement liability if they promptly remove content from their service upon request by the content owner. Therefore, when a content owner finds a copy of his content, he can choose to send a take down notice under DMCA by writing a letter or an email to the OSP or search engine. In response, the OSP or search engine typically must manually remove the content from their service to avoid liability.
From an OSP's perspective, monitoring for content that does not comply with the OSP's host policy is also typically a manual process. When OSPs monitor content as it is uploaded, typically a human views and approves content before (or after) it is displayed and non-compliant content is rejected (or removed). OSPs also must manually review and compare content when they receive DMCA notices, and often have little information to determine if content is out of compliance and no automated way to determine the identity or reputation of the complaining party. As the amount of content on the Internet grows, manual content monitoring and enforcement processes are becoming increasingly impractical. Therefore, improved methods for monitoring content and managing enforcement of non-compliant content are needed. In addition, there currently exists no means to automatically verify content ownership, e.g., for the purpose of facilitating the negotiation, transaction, and/or enforcement of content license(s), and a solution to this problem would also be desirable.
BRIEF DESCRIPTION OF THE DRAWINGS
Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.
is a block diagram illustrating an embodiment of a content monitoring system.
is a flow chart illustrating an embodiment of a process for monitoring content.
is a flow chart illustrating an embodiment of a process for monitoring for use of controlled content.
is a flow chart illustrating an embodiment of a process for evaluating context of a content object.
is a flow chart illustrating an embodiment of a process for monitoring for use of controlled content.
is a flow chart illustrating an embodiment of a process for engaging with a user of non-compliant content.
is a flow chart illustrating an embodiment of a process for displaying compliance information.
is an example of a graphical user interface (GUI) for providing controlled content.
is an example of a GUI for providing controlled content.
is an example of a GUI for providing usage rules.
is an example of a GUI for displaying search results.
is an example of a GUI for displaying use of a content object.
is a block diagram illustrating an embodiment of a system for making a determination of originality of content.
is a flowchart illustrating an embodiment of a process for performing an originality determination.
is a flowchart illustrating an embodiment of a process for making an originality determination.
is a flowchart illustrating an embodiment of a process for computing an originality score for a content object.
is a flowchart illustrating an embodiment of a process for analyzing originality factors related to the host and/or claimed owner of a content object.
is a flowchart illustrating an embodiment of a process for analyzing originality factors.
is a block diagram illustrating an example of originality factors related to the reputation of a host.
is a flowchart illustrating an example usage of a system for determining originality.
The invention can be implemented in numerous ways, including as a process, an apparatus, a system, a composition of matter, a computer readable medium such as a computer readable storage medium or a computer network wherein program instructions are sent over optical or communication links. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. A component such as a processor or a memory described as being configured to perform a task includes both a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task. In general, the order of the steps of disclosed processes may be altered within the scope of the invention.
A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.
is a block diagram illustrating an embodiment of a content monitoring system. In some embodiments, content monitoring system 100 is used by a content owner to monitor for non-compliant use of the content owner's content based on usage rules specified by the content owner. Examples of content owners include: a photographer (e.g., Ansel Adams), a film studio (e.g., Disney), a columnist (e.g., Walter Mossberg), or a media outlet (e.g., The Wall Street Journal). The content owner is not necessarily the same as the content creator. Usage rules (including usage policies, terms of use, usage terms, etc.) are a set of rules regarding conditions under which content may be used, as specified by the content owner. Usage rules may vary depending on the content and/or the content owner and applicable law (such as “fair use”). Usage rules are more fully described below.
In some embodiments, content monitoring system 100 is used by a content host to monitor for non-compliant use of content based on a host policy specified by the content host. A content host refers to an entity that hosts, serves, stores, provides, and/or displays content. Examples of content hosts include OSPs, such as search engines (e.g., Google), photo or video sharing websites (e.g., YouTube, Yahoo), and blogging sites (e.g., TypePad). As used herein, an OSP is an entity that hosts and/or serves or provides content on behalf of itself or other entities. For example, an OSP includes an OSP as defined under DMCA. An OSP includes an electronic content management system (ECM). A host policy is a set of rules regarding conditions under which content may be hosted, as specified by a content host. A host policy may vary depending on the content host. As an example of a host policy, OSPs may have policies that apply to the posting of content by their users, in which they reserve the right to remove content or users in the event of non-compliance (determined at their discretion). In some embodiments, a configurable host policy governs the automatic handling of DMCA notices, as more fully described below.
A content user includes an entity that uses content that is not owned by the content user. A content user includes an entity that owns or posts content. Examples of content users include writers, photographers, bloggers, or any user who posts content on content hosts.
Controlled content refers to content associated with one or more compliance rules, where compliance rules include usage rules specified by a content owner and/or host policy rules specified by a content host. In the case where a content owner is monitoring for use of his content, controlled content is the content owner's content. In the case where a content host is monitoring for non-compliant content, controlled content is content that is non-compliant with the host policy. Monitored content refers to the set of content being searched (i.e., potential matches). In other words, content monitoring system 100 searches monitored content for use of controlled content. As used herein, a match, copy, or use of controlled content does not necessarily refer to an identical match, an identical copy, or use of identical content. A match, copy, or use of controlled content is identified based on criteria such as similarity scores and non-compliance scores, as more fully described below.
Compliant content refers to content that satisfies usage rules associated with the content. In the case where a content host such as an OSP is monitoring for non-compliant content, compliant content refers to content that not only satisfies the usage rules, but also satisfies the host policy of the content host (e.g., the OSP).
As used herein, a unit of content may be referred to as a content object. Content objects can include any object type. Examples of content objects include a text document, an image, video, audio, flash, animation, game, lyrics, code, or portions thereof (e.g., a phrase/sentence/paragraph, a subimage, or a video clip). Other examples include a single file (e.g., an image), all of the text on a web page (e.g., a news article), a chapter in a book, and a blog entry. The content object may be in various audio, image, or video formats, such as MP3, JPEG, MPEG, etc.
Content monitoring system 100 can be used to find copies of a set of content at a given point in time or to regularly monitor for matches. Content monitoring system 100 may be used to monitor data associated with the Internet or any other appropriate environment in which there is a need to monitor content for compliance. Examples of appropriate environments include the Internet, an Intranet, a firewalled network, a private network, an Electronic Data Interchange (EDI) network, an ad hoc network, etc.
As shown, user 102 provides input to ingestor 104. Ingestor 104 provides input to subscriber database 106, content database 108, and crawler 112. Reporter 110 receives input from subscriber database 106 and content database 108. Crawler 112 provides input to digester 114. Digester 114 provides input to content database 108, controlled content store 116, and monitored content store 118. Matching engine 120 provides input to controlled content store 116 and monitored content store 118. Content database 108 interacts with matching engine 120.
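The connections listed above can be written down as a small directed graph. The sketch below is only an illustration: the dictionary layout and the `downstream` helper are assumptions, not part of the described system. Reference numerals are omitted from the names, and "interacts with" is modeled, as an assumption, as an edge in each direction.

```python
from collections import deque

# Data-flow edges between components, following the description above.
# The representation is hypothetical and exists only for illustration.
FLOWS = {
    "user": ["ingestor"],
    "ingestor": ["subscriber database", "content database", "crawler"],
    "subscriber database": ["reporter"],
    "content database": ["reporter", "matching engine"],
    "crawler": ["digester"],
    "digester": [
        "content database",
        "controlled content store",
        "monitored content store",
    ],
    "matching engine": [
        "controlled content store",
        "monitored content store",
        "content database",
    ],
}


def downstream(start: str) -> set[str]:
    """Return every component reachable from `start` by following the edges."""
    seen: set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in FLOWS.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    # Which components can be affected by what the crawler fetches?
    print(sorted(downstream("crawler")))
```

Walking the graph this way is a quick check that content fetched by the crawler can reach both content stores and, through the content database, the reporter.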
Ingestor 104 accepts controlled content from user 102. User 102 includes content owners or administrators of content monitoring system 100. The content may be specified in various ways. A user interface (UI) may be provided for user 102 to specify content. In some embodiments, the UI provides an interface for uploading content or specifying a link/set of links to the content, where the links may be local (e.g., on a local hard drive) or remote (e.g., on a remote server or on the Internet). An example of a remote link is a user's eBay account. User 102 may display, in his eBay store, images to be monitored. For example, user 102 is a photographer selling his photography. Using the UI, user 102 specifies a URL to the eBay store or particular auction. In some embodiments, instead of providing a URL to a particular auction, the content owner provides their username (such as an eBay seller ID), which allows the system to retrieve all of the user-posted content associated with that username, which could be associated with one or more auctions. In some embodiments, the content owner also provides a password if necessary or expedient to locate user-posted content. In some embodiments, a schedule for fetching content may be specified. For example, crawler 112 may be configured to fetch images from the user's eBay store every 24 hours. The raw content is passed to digester 114 for processing and storage.
In some embodiments, the ingesting of content is automatically triggered by content creation. For example, when a blogger posts a new entry, it is automatically ingested. When a writer updates a Word document, the content is automatically ingested.
In some embodiments, if the URL or username provided by the content owner contains some content of third parties, the user is presented with a means to exclude or include specific content objects (such as a single image) from monitoring and from the content owner's usage rules.
The controlled content may be from the Internet or from another source. A manual or automated API may be used to ingest content or perform any of the other processes described herein. A URL or any other appropriate identifier may be used to specify content. Credentials associated with accessing the content, such as a password, may be provided.
Besides controlled content, other data may be provided as input to content monitoring system 100, such as links (e.g., URLs or websites) identified by an administrator, content host, or content owner. These sites may have been identified because the user is aware of a specific instance of non-compliance at that location, or because they have historically posted non-compliant content or are of particular concern to the user. Other examples of additional data that may be input to content monitoring system 100 are more fully described below.
Crawler 112 fetches content from the network. The content to be fetched may include the Internet, a subset of the Internet, a complete domain, or a single piece of content from the web. Identifiers may be used to identify the content to be fetched. Some examples of identifiers include: a URL, a directory, a password-protected website(s), all items for a seller on eBay, and all content of a given type or format (e.g., images only or JPEGs only). In some embodiments, crawler 112 is used with modules that provide different rules for crawling. In some embodiments, crawler 112 fetches content according to a specified schedule.
Controlled content store 116 includes controlled content. In some embodiments, controlled content store 116 includes the following information: a copy of the content, an index of fingerprints associated with the content, and metadata about the content (e.g., filename, URL, fetch date, etc.). In some embodiments, the copy of the content is stored in a separate cache. A fingerprint includes a signature of an object that can be used to detect a copy of an object as a whole or in part. A content object may have more than one fingerprint. A fingerprint may be associated with more than one content object. A fingerprint may be associated with a whole or part of a content object. A fingerprint may be multidimensional. For example, there may be multiple features associated with a fingerprint. A fingerprint may contain multiple fingerprints or subfingerprints.
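To make the notion of a fingerprint index concrete, here is a minimal sketch for text content. It assumes overlapping word shingles hashed with SHA-1; the shingle size, the hash, the truncation to 16 hex characters, and the function names are all illustrative assumptions, not the technique used by the described system. It shows one content object yielding several fingerprints and one fingerprint pointing back to more than one object.

```python
import hashlib
from collections import defaultdict


def text_fingerprints(text: str, k: int = 4) -> set[str]:
    """Hash every run of k consecutive words (a 'shingle') into a short digest."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts yield a single shingle (or none if the text is empty).
        shingles = [" ".join(words)] if words else []
    else:
        shingles = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    return {hashlib.sha1(s.encode("utf-8")).hexdigest()[:16] for s in shingles}


def build_index(objects: dict[str, str]) -> dict[str, set[str]]:
    """Map each fingerprint to the ids of all content objects containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for object_id, text in objects.items():
        for fp in text_fingerprints(text):
            index[fp].add(object_id)
    return index


if __name__ == "__main__":
    store = {
        "a": "the quick brown fox jumps over the lazy dog",
        "b": "a quick brown fox jumps over a fence",
    }
    index = build_index(store)
    shared = [fp for fp, ids in index.items() if len(ids) > 1]
    print(len(index), "fingerprints,", len(shared), "shared by both objects")
```

Because the fingerprints cover overlapping parts of an object, a partial copy still shares some of them with the original, which is what lets an index like this detect copying "in part".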
Monitored content store 118 is a repository for crawled data. Monitored content store 118 may include any digital object collection or environment. In some embodiments, monitored content store 118 is a web store. In some embodiments, there are multiple content stores, e.g., one for each kind of data: text, images, audio, video, etc. In some embodiments, monitored content store 118 includes data from sites that copy the most often, and is updated most frequently. This data may be indicated as such (i.e., tagged or flagged as common copier) or stored separately. In some embodiments, a real-time store (not shown) is used to store various feeds coming in (e.g., from a content owner's blog each time the blog is updated, or from a content owner's eBay store every 24 hours). In some embodiments, a ping server or similar server is used to update feeds coming in. If the feeds contain links, the content is fetched by crawler 112. Over time, data moves from the real-time store to monitored content store 118 as it becomes older. Monitored content store 118 changes periodically, whereas the real-time store keeps changing as content comes in. In some embodiments, external stores (not shown), such as search engines, are accessed using application programming interfaces (APIs). Once data is fetched, it is stored in monitored content store 118. Some embodiments of this are more fully described below. In some embodiments, fingerprints of content are stored in monitored content store 118. In some embodiments, Gigablast is used to fetch and store content data.
Digester 114 receives content fetched by crawler 112, including controlled content or monitored content, and analyzes and processes it. Analysis of content is more fully described below. The content and associated metadata are stored in controlled content store 116 or monitored content store 118, as described above.
In some embodiments, matching engine 120 finds matches to controlled content by comparing controlled content from controlled content store 116 with monitored content from monitored content store 118 based on matching techniques including technical factors, compliance factors, and other factors, as more fully detailed below.
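As a rough illustration of the technical side of matching only, the sketch below compares sets of fingerprints and scores each pair by containment, i.e., the fraction of a controlled object's fingerprints that reappear in a monitored object. The containment measure, the 0.5 threshold, and all names are assumptions made for this example; compliance factors and other factors mentioned above are not modeled.

```python
def containment(controlled: set[str], monitored: set[str]) -> float:
    """Fraction of the controlled object's fingerprints found in the monitored one."""
    if not controlled:
        return 0.0
    return len(controlled & monitored) / len(controlled)


def find_matches(
    controlled_store: dict[str, set[str]],
    monitored_store: dict[str, set[str]],
    threshold: float = 0.5,
) -> list[tuple[str, str, float]]:
    """Return (controlled id, monitored id, score) for pairs at or above threshold."""
    matches = []
    for cid, cfps in controlled_store.items():
        for mid, mfps in monitored_store.items():
            score = containment(cfps, mfps)
            if score >= threshold:
                matches.append((cid, mid, score))
    # Strongest matches first.
    return sorted(matches, key=lambda m: m[2], reverse=True)


if __name__ == "__main__":
    controlled = {"essay": {"f1", "f2", "f3", "f4"}}
    monitored = {"blog": {"f1", "f2", "f3", "x"}, "other": {"y"}}
    print(find_matches(controlled, monitored))
```

Containment is used here rather than a symmetric measure so that a short excerpt embedded in a much longer page can still score highly; a non-identical copy therefore still counts as a potential match.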
Reporter 110 reports match results to user 102 or an administrator of content monitoring system 100. Various user interfaces may be used. Examples of reporting and UIs for reporting results are more fully described below.
Subscriber database 106 contains information about customers. Content database 108 contains references to controlled content and to matched content corresponding to the controlled content. In some embodiments, a separate database is used for matched content.
In some embodiments, content monitoring system 100 is used as a content clearinghouse by content users wishing to use content. Before using a particular content object (i.e., unit of content), the content user checks with content monitoring system 100 to determine whether the conditions under which the content user wishes to use the content comply with the usage policy set by the content owner.
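A clearinghouse check of this kind might look like the following sketch, in which a proposed use is compared against an owner's usage rules. The three rule fields (attribution, commercial use, amount copied) are drawn from the kinds of conditions the description mentions, but the class names, field names, and default values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UsageRules:
    """Hypothetical subset of an owner's usage rules for one content object."""
    attribution_required: bool = True
    commercial_use_allowed: bool = False
    max_fraction_copied: float = 0.25  # assumed limit on amount of duplication


@dataclass
class ProposedUse:
    """Conditions under which a content user wishes to use the content."""
    gives_attribution: bool
    is_commercial: bool
    fraction_copied: float


def check_use(rules: UsageRules, use: ProposedUse) -> list[str]:
    """Return the reasons the proposed use fails; an empty list means it complies."""
    problems = []
    if rules.attribution_required and not use.gives_attribution:
        problems.append("attribution is required")
    if use.is_commercial and not rules.commercial_use_allowed:
        problems.append("commercial use is not allowed")
    if use.fraction_copied > rules.max_fraction_copied:
        problems.append("too much of the content is copied")
    return problems


if __name__ == "__main__":
    rules = UsageRules()
    print(check_use(rules, ProposedUse(True, False, 0.10)))
    print(check_use(rules, ProposedUse(False, True, 0.50)))
```

Returning the list of unmet conditions, rather than a bare yes or no, lets the content user see what would have to change for the use to comply.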
Content monitoring system 100 may be implemented in various ways in various embodiments. For example, controlled content, web data, subscriber data, and/or content data may be organized and stored in one or more databases. Ingesting, crawling, digesting, matching, and/or reporting may be performed using one or more processing engines.
In some embodiments, any of the functions provided by content monitoring system 100, such as ingesting, crawling, digesting, matching, and reporting, may be provided as a web service. For example, content monitoring system 100 or an element of content monitoring system 100 is queried and provides information via XML.
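As a sketch of what providing information via XML could look like, the function below serializes match results with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`matchReport`, `match`, `url`, `score`) are invented for illustration; the description does not define a schema.

```python
import xml.etree.ElementTree as ET


def matches_to_xml(controlled_url: str, matches: list[tuple[str, float]]) -> str:
    """Serialize match results for one controlled object as an XML string."""
    # Hypothetical schema: one <matchReport> holding one <match> per result.
    root = ET.Element("matchReport", {"controlled": controlled_url})
    for url, score in matches:
        ET.SubElement(root, "match", {"url": url, "score": f"{score:.2f}"})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(
        matches_to_xml(
            "http://example.com/original",
            [("http://example.org/copy", 0.75), ("http://example.net/excerpt", 0.5)],
        )
    )
```

A client of such a web service could parse the response with `ET.fromstring` and iterate over the `match` elements.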
is a flow chart illustrating an embodiment of a process for monitoring content. In some embodiments, this process is performed when a content owner is searching or monitoring for non-compliant use of the owner's controlled content. In some embodiments, this process is performed by content monitoring system 100.
In the example shown, the process begins at 202, and controlled content is specified. Controlled content may include text, images, video, or any other type of data. Controlled content may be specified in various ways, such as content located in a particular directory and/or all content contributed by a particular user (e.g., on eBay). A user (e.g., a content owner or an administrator) may specify controlled content using any appropriate interface. Examples of graphical user interfaces are described more fully below. The user may also request a one time search or regular monitoring for the controlled content. In the case of the latter, the user may specify options related to regular monitoring, such as frequency of monitoring, how often reports should be received, etc.
At 203, usage rules are specified. Usage rules include conditions under which a content owner permits the use of owned content. Usage rules may include terms under which a content owner permits the republication and/or modification of content. Usage rules may include different conditions depending on whether the use is for commercial or non-commercial uses, business or education uses, with or without attribution, in a limited amount, in a limited context, etc. The usage rules may be based on any appropriate compliance structure, such as “fair use,” “copy left,” “share alike,” Creative Commons specified structures, user specific compliance rules, rules against associating the controlled content with objectionable content (e.g., obscenity, adult content, child pornography), rules requiring attribution, moral rights, rights of personality, or any legal or personal compliance structure. A usage rule may take into account editorial context. In other words, certain uses may be permitted that are not permitted in another context. For example, if the controlled content object is a book, portions from the book may be permitted to be used in a book review but not in another context (where other rules may apply).
A variety of user interfaces may be used to specify usage rules. For example, a list of terms, checkboxes (to apply a rule), and settings (specific to a rule) may be provided. The list may include, for example: whether attribution is required, amount of duplication allowed, whether commercial use is allowed, whether changes are allowed, whether permission is required, whether derivative content is allowed, geographical requirements, whether the owner requires advertisement revenue sharing (e.g., using Google AdSense) and associated terms and information, etc. The usage rules may be hierarchical. For example, a list of higher level rules or compliance structures may be displayed for selection, each of which may be expanded to display lower level rules that each of the high level rules comprises. Usage rules may have any number of levels. Checkboxes (or another appropriate object) may be located next to the higher level or lower level rules and may be selected (e.g., checked off) at any level of granularity. For example, selecting checkboxes next to a higher level rule automatically selects all corresponding lower level rules. Alternatively, lower level rules may be individually selected. An example of a higher level rule is a particular type of license. Lower level rules under the license include the specific usage rules associated with the license.
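The hierarchical selection behavior described above can be sketched as a small tree, where checking a higher level rule checks everything beneath it and lower level rules can also be checked one at a time. The `Rule` class and the example rule names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """A usage rule that may group lower level rules (any number of levels)."""
    name: str
    selected: bool = False
    children: list["Rule"] = field(default_factory=list)

    def select(self) -> None:
        """Check this rule and, automatically, every rule beneath it."""
        self.selected = True
        for child in self.children:
            child.select()

    def selected_leaves(self) -> list[str]:
        """Names of the lowest level rules that are currently checked."""
        if not self.children:
            return [self.name] if self.selected else []
        names: list[str] = []
        for child in self.children:
            names.extend(child.selected_leaves())
        return names


if __name__ == "__main__":
    # A higher level rule (an example license) and its lower level rules.
    example_license = Rule(
        "example license",
        children=[Rule("attribution required"), Rule("no commercial use")],
    )
    example_license.select()
    print(example_license.selected_leaves())

    # Lower level rules may instead be selected individually.
    custom = Rule("custom", children=[Rule("no changes"), Rule("permission required")])
    custom.children[1].selected = True
    print(custom.selected_leaves())
```

Only the lowest level rules are reported because, in this sketch, a higher level rule is treated purely as a grouping of the specific rules it comprises.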
Usage rules may be customized for each content owner (and for each content object). In some embodiments, a unique URL is provided to the content owner for his use (e.g., to include as a link associated with an icon placed in proximity to his content on his website, in his eBay store, etc.). When a content user wishes to use content on the content owner's website, the content user can then select the link, which leads to a page describing the content owner's usage rules (for that content object).
In some embodiments, rather than providing a unique URL to the content owner, the content owner could use a particular URL on his website or web page. For example, the particular URL could be “rules.attributor.com.” When a content user wishes to use content on the content owner's website, the content user can select the link, which leads to a page describing the content owner's usage rules (for the website or content on the website). In this case, the content monitoring system determines from which website the link was selected and can determine which usage rules to display. In some embodiments, the same URL is common to multiple content owners' websites. Further examples are discussed below.
Usage rules may be stored in the content monitoring system. For example, the usage rules for content owners may be stored in controlled content store 116 (e.g., as metadata associated with the content object) or in subscriber database 106.
At 204, controlled content is acquired. In some embodiments, 204 is performed by ingestor 104 in system 100. In various embodiments, controlled content is obtained from a source specified at 202. For example, controlled content is obtained from a particular directory or from one or more servers containing content contributed by a particular user. Controlled content acquisition may be automated or non-automated. For example, an automated process could poll for updates and acquire controlled content when an update is detected. In some embodiments, a ping server is used to detect updates. In some embodiments, controlled content is continuously acquired or ingested. For example, if the controlled content is specified as all content contributed by a particular user on eBay, then when the user contributes new content to eBay, that content is automatically acquired or acquired at configured times or time intervals. A variety of APIs may be used to acquire controlled content. In some embodiments, after controlled content is acquired, the user is given an opportunity to confirm that it is the correct controlled content or the controlled content the user intended. The acquisition of controlled content may involve any network, protocol (e.g., UDP, TCP/IP), firewall, etc.
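One simple way an automated process could poll for updates is to hash whatever the source currently returns and acquire the content only when the hash changes. The sketch below assumes that approach; the `poll_once` function, the injected `fetch` callable, and the use of SHA-256 are illustrative choices, not details taken from the description.

```python
import hashlib
from typing import Callable, Optional


def poll_once(
    source_id: str,
    fetch: Callable[[str], bytes],
    last_digest: dict[str, str],
) -> Optional[bytes]:
    """Fetch a source and return its content only if it changed since the last poll."""
    data = fetch(source_id)
    digest = hashlib.sha256(data).hexdigest()
    if last_digest.get(source_id) == digest:
        return None  # no update detected, nothing to acquire
    last_digest[source_id] = digest
    return data


if __name__ == "__main__":
    # A stand-in for a real source; a deployment would fetch over the network.
    pages = {"seller-store": b"listing v1"}
    seen: dict[str, str] = {}

    print(poll_once("seller-store", pages.__getitem__, seen))  # first poll acquires
    print(poll_once("seller-store", pages.__getitem__, seen))  # unchanged
    pages["seller-store"] = b"listing v2"
    print(poll_once("seller-store", pages.__getitem__, seen))  # update acquired
```

Calling `poll_once` at configured times or intervals gives the scheduled behavior described above, while a ping server would replace the polling loop with a notification that triggers the same acquisition step.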
At 206, controlled content is analyzed. In some embodiments, 206 is performed by digester 114 in system 100. In some embodiments, the acquired content is analyzed for unique identifying features. Any appropriate technique may be used to extract features from the content. For example, a fingerprint associated with the content may be determined. The technique may depend on the media type (e.g., spectral analysis for audio/video, histogram or wavelets for images/video, etc.). For example, in the case of text content, various techniques may be used, such as unique phrase extraction, word histograms, text fingerprinting, etc. An example is described in T. Hoad and J. Zobel, “Methods for identifying versioned and plagiarized documents,” in Journal of the American Society for Information Science and Technology, Volume 54, Issue 3, 2003. In the case of image content, various techniques may be used, including key point identification, color histograms, texture extraction, image signatures, or extraction of any other feature. An example is described in Y. Ke, R. Sukthankar, and L. Houston, “Efficient near-duplicate detection and sub-image retrieval,” in ACM Multimedia. ACM, October 2004, pp. 1150-1157. In the case of video content, a video fingerprinting technique may be used. In another example, a signature is formed for each clip by selecting a small number of its frames that are most similar to a set of random seed images, as further described in S.-C. Cheung, A. Zakhor, “Efficient Video Similarity Measurement with Video Signature,” Submitted to IEEE Trans. on CSVT, January, 2002. In the case of audio content, an audio fingerprinting technology may be used. For example, a spectral signature is obtained and used as input to a hash function. In various embodiments, other techniques may be used. Analyzing may include determining spectral data, wavelet, key point identification, or feature extraction associated with the controlled content. In some embodiments, results from the analysis are stored in controlled content store 116 in system 100.
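As an illustration of the text fingerprinting mentioned at 206, the following is a minimal sketch, not the method of any particular embodiment. It assumes a fingerprint made of the smallest hashes of overlapping word windows ("shingles"); the function names `shingles`, `fingerprint`, and `resemblance`, and the parameters `k` and `keep`, are hypothetical.

```python
import hashlib

def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def fingerprint(text, k=5, keep=8):
    """Hash every shingle and keep the `keep` smallest hashes (assumed scheme)."""
    hashes = sorted(
        int(hashlib.sha1(s.encode("utf-8")).hexdigest()[:12], 16)
        for s in shingles(text, k)
    )
    return frozenset(hashes[:keep])

def resemblance(fp_a, fp_b):
    """Jaccard overlap of two fingerprints, in the range [0, 1]."""
    if not fp_a or not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)
```

Because the text is lowercased before hashing, two copies that differ only in letter case produce the same fingerprint. Other media types would need a different feature extractor.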
At 208, monitored content is searched for use of controlled content. In some embodiments, monitored content is specified by a user, such as a content owner or administrator. The entire web may be searched, or a subset of the web (e.g., websites that have been identified as sites that copy the most often or data in a content store such as monitored content store 118). A database of sites that have been crawled and resulting data may be maintained that is updated at various times. Rather than searching the entire web, the database may be used instead. Searching may comprise a combination of searching the web and consulting a database of previously crawled websites. In some embodiments, monitored content store 118 in system 100 stores previously crawled websites. In some embodiments, 208 is performed by crawler 112 in system 100.
Searching may be performed in one or more stages, each stage refining the search further. For example, a first search may yield a first set of candidate content objects. A second search searches the first set of candidate content objects to yield a second set of content objects, and so forth. Eventually, the final set of content object(s) includes the content object(s) that match or most closely match the controlled content object. In some embodiments, less expensive and/or less complex techniques may be used to obtain candidate sets followed by one or more tighter, smaller granularity techniques to progressively enhance the resolution of the analysis. Which techniques may be used and in which order may be determined based on cost and/or complexity. In some embodiments, the second search comprises a manual search. For example, the second set of content objects may be a smaller set and may be searched by a human.
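The staged search described above can be sketched as a pipeline of progressively costlier filters. This is an illustrative sketch only: the two scoring functions (`shared_word_ratio` as the cheap pass, `longest_common_run` as the tighter pass) and the thresholds are assumptions chosen for the example, not techniques named in the text.

```python
def staged_search(controlled, candidates, stages):
    """Apply (score_fn, threshold) stages in order, cheapest first.

    Each stage keeps only candidates whose score against the controlled
    object meets the stage threshold, so later stages see a smaller set.
    """
    remaining = list(candidates)
    for score_fn, threshold in stages:
        remaining = [c for c in remaining if score_fn(controlled, c) >= threshold]
    return remaining

def shared_word_ratio(a, b):
    """Cheap pass: fraction of a's distinct words that also occur in b."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa) if wa else 0.0

def longest_common_run(a, b):
    """Costlier pass: longest run of consecutive shared words, as a fraction of a."""
    wa, wb = a.lower().split(), b.lower().split()
    best = 0
    prev = [0] * (len(wb) + 1)
    for x in wa:
        cur = [0] * (len(wb) + 1)
        for j, y in enumerate(wb, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best / len(wa) if wa else 0.0
```

A call such as `staged_search(doc, pages, [(shared_word_ratio, 0.8), (longest_common_run, 0.8)])` runs the word-overlap filter on every page but the quadratic run comparison only on the survivors.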
In some embodiments, a hash structure is used to obtain a candidate set of content objects. For example, a hash table is maintained such that similar content objects are hashed to the same or a nearby location in a hash table. This way, to search for content object A, a hash function associated with A is computed and looked up in a hash table, and a set of objects that are similar to A is obtained. A hash function associated with a content object may be computed in various ways. The hash function may be computed differently depending on the type of content object or one or more characteristics of the content object. For example, if the content object is a text document, a fingerprinting technique specific to text may be used to obtain a fingerprint of the document. The fingerprint may be input to the hash function to obtain a hash value that corresponds to a group of other content objects that have a similar fingerprint. Hash values that are nearby in the hash table correspond to content objects that have similar (though less similar than those in the same hash bin) fingerprints, to create a clustering effect. In this way, a candidate set of content objects may be obtained.
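The clustering effect described above, where similar objects land in the same or a nearby bin, can be sketched with a toy index. This sketch assumes the fingerprint has already been reduced to a single number so that quantizing it preserves locality; the class name `BucketIndex` and the parameters `bin_width` and `radius` are hypothetical, and a practical system would likely use a multi-dimensional locality-sensitive scheme.

```python
from collections import defaultdict

class BucketIndex:
    """Toy locality-preserving hash table keyed by a quantized numeric fingerprint."""

    def __init__(self, bin_width=1.0):
        self.bin_width = bin_width
        self.bins = defaultdict(list)

    def _bin(self, fp):
        # Nearby fingerprint values fall in the same or an adjacent bin.
        return int(fp // self.bin_width)

    def add(self, object_id, fp):
        self.bins[self._bin(fp)].append(object_id)

    def candidates(self, fp, radius=1):
        """Return ids in the fingerprint's bin and in bins within `radius` of it."""
        center = self._bin(fp)
        found = []
        for b in range(center - radius, center + radius + 1):
            found.extend(self.bins.get(b, []))
        return found
```

Looking up with `radius=0` returns only the objects in the same bin (the most similar group); a larger radius widens the candidate set to less similar neighbors.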
Other techniques such as cosine similarity, latent semantic indexing, keyword based methods, etc., may also be used.
In some embodiments, existing search engines or search facilities on websites, such as eBay, are used to obtain a candidate set of documents. This approach may be useful in an initial implementation of the system. For example, APIs provided by Google or other search engines may be used to perform this search. For example, to search for a document, a unique phrase within the document is selected. The unique phrase is input to a Google search using a Google API and the results are a candidate set of documents. Multimedia search engines (e.g., video, image) may be used to obtain a candidate set of documents. In the case of images, an image search engine may be used to obtain a candidate set of images. For example, Riya (www.Riya.com) includes an image search engine that may be used to obtain a candidate set.
In some embodiments, besides the Internet, databases may be searched using these techniques. Some examples of databases include Factiva, Corbis, and Hoover's. Although these databases do not allow indexing of their documents, they do have a search interface. This search interface may be used to perform searches for content using unique phrase extraction. For example, articles in the Factiva database containing a unique phrase from a controlled content object are more likely to be a match. A subsequent search may be performed by obtaining the full text of the articles and searching them using more refined techniques. Searching this way limits having to crawl the entire Internet. Also the more computationally intensive search techniques are limited to a smaller search space.
In some embodiments, once a candidate set of content objects is obtained, one or more refining searches are performed. For example, the candidate set of documents are crawled and advanced matching techniques can be applied to the candidate set of documents. A variety of content or document similarity techniques may be used. For example, the techniques described at 206 may be used on the candidate set of content objects.
In the case of text documents, a refining search may comprise computing a signature for each paragraph or other data set. A Levenshtein distance could be used to determine the similarity between a document and the controlled content object. A byte by byte comparison could be used. Other techniques, such as anchoring or cosine similarity may be used, as described more fully in T. Hoad and J. Zobel, “Methods for identifying versioned and plagiarized documents,” in Journal of the American Society for Information Science and Technology, Volume 54, Issue 3, 2003. Techniques such as PCA-SIFT or feature extraction of color, texture and signature generation may be used. For example, A. C. Popescu and H. Farid, “Exposing Digital Forgeries by Detecting Duplicated Image Regions,” Technical Report TR2004-515, Dartmouth College, Computer Science, describes examples of such techniques.
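For the text case, the edit distance mentioned above can be computed with the standard dynamic-programming recurrence. The following is a minimal sketch; the `similarity` helper, which rescales the distance to the range [0, 1], is an assumed normalization for illustration and is not specified in the text.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]

def similarity(a, b):
    """Assumed normalization: 1.0 for identical strings, 0.0 for fully different."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```

The cost grows with the product of the two lengths, which is one reason such a comparison suits a refining search over a small candidate set rather than a first pass.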
In the case of images, images may be subsampled to be robust against cropping and subimage reuse using techniques such as key pointing (or key point extraction), which looks for unique signatures within a portion of an image, such as edges or extreme color gradations, and samples these portions to obtain a signature. Another way is to subsample distinctive portions of a color histogram of the image.
In some embodiments, different techniques are used depending on characteristics of the content object. For example, if a document has fewer than 20 paragraphs, a byte by byte comparison may be used. If a document has 20 or more paragraphs, a different technique may be used. Sampling and anchoring points may depend on the format of the document.
At 210, use of controlled content is detected. In some embodiments, 210-213 are performed by matching engine 110 in system 100. In some embodiments, detection is based on various criteria associated with technical factors that may result from searching at 208. An example of a technical factor is a similarity score. A similarity score is a measure of the similarity between two content objects and may be computed in a variety of ways. For example, the Levenshtein distance is a similarity score. In some embodiments, if similarity scores meet one or more criteria, use of controlled content is detected. The criteria may be configurable by the user or administrator. One or more similarity scores may be computed for a controlled object and candidate object to represent various characteristics of the content. In some embodiments, one or more similarity scores may be weighted and combined into a single similarity score.
A similarity score may account for various degrees of copying. For example, the first and last paragraph of a document may be copied, a portion of a document may be copied, or the whole document may be copied. Different samples of music may be copied into a single audio file. Videos may be mixed from copied videos. One controlled document may have 15 samples, one or more of which may be copied. A similarity score may account for these factors. For example, a copying extent score may be used to indicate the percentage of a controlled content object that has been copied. A copying density score may be used to indicate the percentage of a match that is comprised of a controlled content object.
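The copying extent and copying density scores can be illustrated with a short sketch. It assumes that both objects are text split into paragraphs and that a paragraph counts as copied only when it appears verbatim in both; the function name `copy_scores` and that exact-match rule are assumptions for the example.

```python
def copy_scores(controlled_paragraphs, candidate_paragraphs):
    """Return (extent, density) as percentages.

    extent:  share of the controlled object that appears in the candidate.
    density: share of the candidate that consists of controlled content.
    """
    controlled = {p.strip() for p in controlled_paragraphs if p.strip()}
    candidate = {p.strip() for p in candidate_paragraphs if p.strip()}
    shared = controlled & candidate
    extent = 100.0 * len(shared) / len(controlled) if controlled else 0.0
    density = 100.0 * len(shared) / len(candidate) if candidate else 0.0
    return extent, density
```

For a controlled document of four paragraphs, two of which appear in a five-paragraph candidate page, the extent is 50 percent and the density is 40 percent.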
At 212, a context associated with the use of the controlled content is evaluated. The context refers to any attribute associated with the use of the content object. For example, the context includes compliance factors, technical factors, and reputation information. Context may be automatically and/or manually determined.
Compliance factors are based on usage rules specified by content owners. For example, compliance factors include information related to attribution and commercial context. Examples of compliance factors include whether the site is government, education, commercial, revenue producing, subscription based, advertising supported, or produces revenue in some other way (e.g., using a reputation bartering scheme associated with a compensation mechanism). This can be determined manually or automatically. For example, a human could review the website, or based on the top level domain (e.g., .edu, .com, .org), or the presence of advertising related HTML code, it can be determined whether the website is commercial.
In some embodiments, a non-compliance score is computed to represent the likelihood that a content object is non-compliant based on the compliance factors. In some embodiments, multiple compliance factors are used to determine a non-compliance score. For example, the non-compliance score takes multiple compliance factors, normalizes and weighs each one as appropriate, and takes the sum. In some embodiments, the weighting is based on usage rules and/or host policy rules. In addition an overall weight may be used to scale the non-compliance score. For example, content found on educational sites may be weighted differently. One or more non-compliance scores may be computed.
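The weighted sum described above can be sketched as follows. The factor names and weights in the usage note are hypothetical, and clamping each factor to the range [0, 1] is an assumed form of normalization.

```python
def non_compliance_score(factors, weights, overall_weight=1.0):
    """Combine compliance factors into one score.

    factors: mapping of factor name to a raw value (clamped to [0, 1] here).
    weights: mapping of factor name to its weight; missing names weigh 0.
    overall_weight: scale applied to the sum, e.g. lower for educational sites.
    """
    total = 0.0
    for name, value in factors.items():
        normalized = min(max(value, 0.0), 1.0)
        total += weights.get(name, 0.0) * normalized
    return overall_weight * total
```

With weights `{"has_ads": 40, "requires_subscription": 30, "missing_attribution": 30}`, a page that carries advertising and omits attribution scores 70; applying an overall weight of 0.5 for an educational site halves that to 35.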
Besides technical factors and compliance factors, examples of other factors include reputation information. For example, a reputation database is maintained that includes reputation ratings of content users by other content owners. For example, Bob's blog may have a low reputation because it has posted numerous copyrighted content objects owned by others who have given Bob's blog a low reputation rating.
At 213, matching content (i.e., match content object(s)) is identified based on detection at 210 and/or evaluation at 212. As previously described, a match, copy, or use of controlled content does not necessarily refer to an identical match, an identical copy, or use of identical content.
In some embodiments, a match is a technical match and is selected based only on technical factors, such as similarity scores. In this case, technical matches are identified at 210, and at 212, the technical matches are evaluated based on context to determine whether they are compliant.
In other embodiments, a match is selected based on configurable criteria associated with technical factors (e.g., similarity scores), compliance factors (e.g., non-compliance scores), and/or other factors (e.g., reputation information). In some embodiments, it is determined that content objects with one or more similarity scores that exceed a similarity score threshold and one or more non-compliance scores that exceed a non-compliance score threshold are matches. In other words, a content object that is technically similar, but is compliant with applicable usage rules, would not be considered a match. In some embodiments, it is determined that any content object with one or more similarity scores that exceed a similarity score threshold is a match.
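The two threshold policies described above can be expressed in one small function. This is a sketch under the assumption that scores are plain numbers and that "exceeds" means strictly greater than; the function name `is_match` and its parameters are hypothetical.

```python
def is_match(similarity_scores, non_compliance_scores,
             similarity_threshold, non_compliance_threshold=None):
    """Decide whether a candidate content object is a match.

    With non_compliance_threshold=None, any object that is similar enough
    is a match. Otherwise the object must also exceed the non-compliance
    threshold, so similar but compliant objects are not matches.
    """
    similar = any(s > similarity_threshold for s in similarity_scores)
    if non_compliance_threshold is None:
        return similar
    non_compliant = any(n > non_compliance_threshold for n in non_compliance_scores)
    return similar and non_compliant
```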
In some embodiments, a binary flagging is used. For example, it is determined that content objects with one or more similarity scores that exceed a similarity score threshold and/or one or more non-compliance scores that exceed a non-compliance score threshold are “interesting” and other content objects are “non-interesting.” In some embodiments, “interesting” content objects are reported to the user at 214.
At 214, content is reported to the user (e.g., content owner). In some embodiments, which content to report is configurable and may depend on criteria based on technical factors (e.g., similarity scores), compliance factors (e.g., non-compliance scores), and/or other factors (e.g., reputation information). In some embodiments, matching content as identified at 213 is reported to the user. In some embodiments, a user views and manually confirms whether each matching content object is non-compliant. The results may be stored in a common database.
In some embodiments, 214 is performed by reporter 110 in system 100. Various interfaces could be used. Screenshots, links, buttons, tabs, etc. may be organized in any appropriate fashion. In some embodiments, a user interface is presented to the user that shows the matching content, one or more similarity scores, and one or more non-compliance scores. Example interfaces for reporting results are more fully described below.
In some embodiments, the interface provides a way for the user to confirm that content is the user's content or reject the content (i.e., indicate a false positive). This data may be fed back into the monitoring process. For example, this information may be stored in a database or with the content metadata. In some embodiments, the interface provides choices of actions for the user to select from (e.g., ask that the reusing party attributes it, offer license/licensing terms, remove under DMCA, etc.).
In some embodiments, 214 is not performed and the process continues at 216.
At 216, the user of the content is engaged. In some embodiments, user contact information is obtained from the IP address, the U.S. Copyright Office (e.g., a designated agent registered with the U.S. Copyright Office), or a known email address (e.g., of an OSP or a user of an OSP). A database or lookup table of contact information associated with various sites may be maintained and used to determine user contact information.
Depending on configuration settings, various types of communication may be sent to the content user. For example, a DMCA notice, information concerning usage rules, licensing information, etc. may be sent. For example, the content owner may have specified one or more usage rules associated with his content, such as “do not license any content,” “replace content with an advertisement,” “add watermark to content,” “add Unicode overlay,” “share advertisement revenue,” or “ask permission prior to use.” Based on the usage rules, an appropriate communication may be sent to the content user. In some embodiments, the content user is also configured to use the content monitoring system. The content user may have specified a set of compliance rules, such as “automatically debit my account up to $100 per year when licensed content is used,” “offer to share advertising revenue when contacted by content owner,” “remove content when contacted by content owner,” etc. Based on the compliance rules, an appropriate response may be sent back to the content owner. In some embodiments, an engagement communication may be configured to be sent in a way that preserves the anonymity of the sender of the engagement communication (e.g., the content owner, or a content host, as more fully described below).
An example of an engagement communication includes an email that is automatically sent to a content user notifying the user that the content is owned and offering to license it for $9.99 per year, and including a link to the content owner's usage rules hosted by the content monitoring system. The content owner may configure his settings so that the email is not sent to content users whose sites are educational or non-profit or those settings may be default settings if the content owner's usage rules indicate free use by educational or non-profit sites. In response, the content user sends a response agreeing to the terms. The response may be created and/or sent automatically because the content user's compliance rules indicate the following rule: “automatically debit my account up to $100 per year when licensed content is used.” The response may be sent manually, or the user may approve an automatically created response before it is sent.
In some embodiments, a series of communications may occur between the content user and content owner. On the content user and/or the content owner's side, the responses may be automatic. In this way, licensing terms can be negotiated and/or steps can be taken towards resolution.
In some embodiments, compensation is not necessarily monetary. For example, the content owner may just want to receive attribution; license revenue or advertising revenue sharing may be donated to charitable or other causes as directed by the content owner, or may be treated as a credit towards a trade (e.g., if you use my content, I can use your content); or the content owner may require that the content and derivative works be presented in a manner that enables tracking of the number of uses or views of the content, or that derivative works be available for use by others under specified usage rules.
In some embodiments, whenever new controlled content is provided, processes 202-206 are performed. In some embodiments, every prespecified search interval, processes 208-213 are performed. In some embodiments, every prespecified report interval, 214 is performed. For example, an email may be sent to the user indicating that new matches have been found, and a link to the web interface provided in the email message. In some embodiments, 214 is performed each time a user logs into the content monitoring system. In some embodiments, 208-213 are performed when a user logs into the content monitoring system, either automatically, or after a user selects an “update results” or “search” button upon logging in.
In some embodiments, the number of accesses to a controlled content object is tracked. For example, the content is associated with a web beacon or other element of code that enables the tracking of accesses of the content for purposes such as calculation of license fees or revenue sharing.
FIG. 2B is a flow chart illustrating an embodiment of a process for monitoring for use of controlled content. In some embodiments, this process is performed when a content host, such as an OSP, is searching or monitoring for non-compliant use of content based on a host policy of the content host. Thus, the controlled content in this case is non-compliant content based on a host policy. In some embodiments, this process is performed by content monitoring system 100.
At 230, a host policy is specified. For example, an OSP may have a policy regarding what comprises non-compliant content. Non-compliant content may include material that violates third party copyrights or trademarks, is illegal (e.g., child pornography) or does not comply with an OSP's terms of service (e.g., adult content, pornography, obscenity). A host policy may include host rules that may be associated with any compliance structure, such as host specific compliance rules, rules against objectionable content (e.g., obscenity, adult content, child pornography), or any legal or personal compliance structure. A host policy may specify that content must comply with usage rules specified by the content owner, such as “copy left,” “share alike,” Creative Commons specified structures, etc.
A variety of user interfaces may be used to specify a host policy. For example, any of the user interfaces described at 203 for specifying usage rules may be used to specify a host policy. For example, a list of terms, checkboxes (to apply a rule), and settings (specific to a rule) may be provided. The list may include, for example: whether pornography is allowed, whether profanity is allowed, whether to comply with one or more usage rules, whether to comply with copyright or other legal structures, etc. The rules may be hierarchical. For example, a list of higher level rules or compliance structures may be displayed for selection, each of which may be expanded to display lower level rules that each of the high level rules comprises. Rules may have any number of levels. Checkboxes (or another appropriate object) may be located next to the higher level or lower level rules and may be selected (e.g., checked off) at any level of granularity.
At 232, content is monitored for use of controlled content. In this case, the monitored content comprises the content hosted by the content host (e.g., the content served by the OSP). In some embodiments, monitoring comprises checking each content object before it is hosted (or served) by the OSP. For example, an OSP such as youtube.com may check each video before it is made available for viewing on youtube.com. In some embodiments, monitoring comprises periodically checking content objects served by the OSP. For example, a new video is made available for viewing immediately after being posted, but the video may later be removed by a monitoring process that checks new content objects. If the video is determined to be non-compliant, it is removed and the video owner is optionally notified. The results of the check are stored in a database so that the video does not need to be checked again unless it is modified.
In some embodiments, if information obtained from the database is not enough to determine whether the content is compliant, an evaluation is performed, where the evaluation can include techniques described at 212. The evaluation may also include techniques used to detect objects or characteristics of objects in an image, such as faces, body parts, the age of a person being depicted, etc. Such techniques may be useful to detect pornography or child pornography, for example. The evaluation results may then be stored in the database.
Examples of monitoring are more fully described below with respect to FIG. 2D.
In some embodiments, a common pool of objectionable content is maintained based on input from multiple content hosts. For example, the common pool may include content that has been identified by various content hosts as containing pornography, child pornography, profanity, or racial content. Depending on the compliance rules specified in their host policies, an OSP may have an interest in contributing to, sharing, and using the common pool to identify objectionable content and remove or reject it.
For example, an OSP such as eBay may desire to monitor content posted by its users. An eBay employee manually performs simple filtering for adult content. Each time the eBay employee flags an object as “adult content,” that object is acquired by the content monitoring system and becomes part of a common pool of objectionable controlled content.
Content in the objectionable database may also be stored with a certainty rating. For example, the greater the number of times the content object has been identified as violating a rule, the greater the certainty rating. In some embodiments, for each content object in the objectionable database, data is maintained regarding each usage/compliance rule that it violates. For example, content object 10034 may be non-compliant with rules 4, 7, and 112, but not other rules. This information may be stored in a table, metadata associated with content object 10034, or in any other appropriate way.
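One way to hold the per-rule violation data and certainty rating is sketched below. It assumes the certainty rating is simply the number of times an object has been flagged for a rule, which is only one possible reading of the text; the class name `ObjectionablePool` and its methods are hypothetical.

```python
from collections import Counter

class ObjectionablePool:
    """Common pool of flagged content, tracked per object and per rule."""

    def __init__(self):
        self._flags = {}  # object_id -> Counter mapping rule_id -> times flagged

    def flag(self, object_id, rule_id):
        """Record one report that object_id violates rule_id."""
        self._flags.setdefault(object_id, Counter())[rule_id] += 1

    def violated_rules(self, object_id):
        """Return the sorted rule ids that the object has been flagged for."""
        return sorted(self._flags.get(object_id, {}))

    def certainty(self, object_id, rule_id):
        """Assumed rating: raw count of flags for this object and rule."""
        return self._flags.get(object_id, {}).get(rule_id, 0)
```

Flagging content object 10034 for rules 4, 7, and 112 makes `violated_rules(10034)` return `[4, 7, 112]`, and each further flag for the same rule raises its certainty by one.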
In some embodiments, if the content is being monitored for by a user at 202-213, data from that process may be re-used at 232. For example, similarity, compliance, and other factors may be determined based on data already obtained at 202-213. Additional compliance factors that take into account the host policy may also be determined and used.
At 234, content is reported. In some embodiments, which content to report is configurable and may depend on criteria based on technical factors (e.g., similarity scores), compliance factors (e.g., non-compliance scores), and/or other factors (e.g., reputation information) as described at 214. Content reported may include content determined to be non-compliant based on the host policy. Content reported may also include notices received from content owners who believe the content host is using their content in a non-compliant way.
For example, a web interface may be provided for viewing and managing reported content. In some embodiments, the web interface allows the host to track and manage past and/or pending engagement notices. The web interface includes information about matching content, reputation information, similarity scores, non-compliance scores, link(s) to usage rules associated with the content object, and any other appropriate information. Reputation information could be related to the content owner, e.g., how reputable the content owner is. For example, the content owner may not actually be the content owner, but a scam artist or spammer who has sent thousands of notices. On the other hand, a reputable content owner may have only sent 3 notices in the past year. In some embodiments, reputation is based on ratings by other content users, content hosts, and/or other users of the content monitoring system. For example, content users who have dealt with a particular content owner and felt that he was legitimate may have given him good reputation ratings. In some embodiments, APIs to the content monitoring system are provided to the OSP for managing notices and responding.
At 236, the report is responded to. In some embodiments, an automatic response is sent according to rules set by the OSP. For example, whenever the OSP receives a DMCA notice from a content owner with a reputation rating above a specified value, it automatically takes down the image. In another example, whenever a child pornography match is made with a similarity score above 90 and a non-compliance score above 80, an email is sent to the user and if no response is received within a set period of time, the content is removed. In some embodiments, an OSP administrator manually reviews each content match and selects a response for each content match.
Besides a common pool of objectionable content, various common/collaborative pools of data may be maintained. Other examples of common pools of data include reputation of content owners, reputation of content users, reputation of content hosts, content known to be in the public domain, sites known to copy the most often, etc. These common pools may be contributed to by content owners (e.g., end users), content hosts (e.g., an employee on an OSP's content review team), legal experts, experts in “fair use,” other reputable entities, results from previous detection results (e.g., false positives), etc. APIs or other interfaces may be provided to assist with flagging content for inclusion in these pools. These common pools of data may then be accessed and used during the monitoring process (e.g., during 202-216 or 231-232).
For example, a negation database may be maintained that includes content that is known to be in the public domain, content that has expired or lapsed in copyright, and/or content that is difficult to claim ownership of, e.g., because it is common, such as disclaimers and copyright notices. Any content in the negation database is designated as compliant.
FIG. 2C is a flow chart illustrating an embodiment of a process for evaluating context of a content object. In some embodiments, this process is used to perform 212 when the context includes compliance information (e.g., compliance factors). Examples of compliance factors include the presence or absence of advertising on a page containing the content object, whether the page contains paid content, etc. In some embodiments, this process is performed by content monitoring system 100. In some embodiments, this process is performed when a content owner is monitoring for use of his content.
At 240, a detected content object associated with use of controlled content is obtained. In some embodiments, the detected content object is detected based on technical factors, as described at 210.
At 242, usage rules associated with the controlled content are obtained. In some embodiments, the usage rules specified by the content owner at 203 are obtained.
At 246, a usage rule is evaluated against the detected content object. The usage rule may be specified at a high level (e.g., do not permit use on for-profit sites, permit use on nonprofit sites) or at a lower level (e.g., do not permit use on pages containing advertising, offer to license on pages containing paid content, permit use on sites ending with .edu). For example, it is determined whether the page associated with the content object contains advertising, requires a subscription, contains affiliate links, or contains paid content.
At 248, it is determined whether the usage rule is satisfied. If not, one or more scores are adjusted. For example, a non-compliance score may be increased or decreased as appropriate. At 252, it is determined whether there are additional rules to check. If there are additional rules to check, the process returns to 246. If there are no additional rules to check, one or more scores are provided.
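The loop over usage rules at 246-252 can be sketched as follows. The sketch assumes each rule is a predicate over a dictionary describing the page and that every unsatisfied rule raises the non-compliance score by a fixed penalty; the page fields, rule descriptions, and the `penalty` parameter are hypothetical.

```python
def evaluate_usage_rules(page, rules, penalty=1.0):
    """Check each usage rule against the page of a detected content object.

    rules: list of (description, predicate) pairs, where predicate(page)
    returns True when the rule is satisfied.
    Returns the non-compliance score and the descriptions of violated rules.
    """
    score = 0.0
    violated = []
    for description, is_satisfied in rules:  # 246, repeated via 252
        if not is_satisfied(page):           # 248: rule not satisfied
            score += penalty                 # adjust the score
            violated.append(description)
    return score, violated
```

For a page on an .edu domain that carries advertising, a rule set of "no advertising on the page" and "domain ends with .edu" yields a score of 1.0 with only the advertising rule listed as violated.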
FIG. 2D is a flow chart illustrating an embodiment of a process for monitoring for use of controlled content. In some embodiments, this process is performed by content monitoring system 100. In some embodiments, this process is performed when a content host, such as an OSP, is checking for non-compliant use of controlled content. For example, this process may be used to perform 232.
At 260, a content object is received. For example, a user is posting a new content object to an OSP site, and the OSP is receiving the content object for the first time. At 262, a fingerprint of the content object is generated. A fingerprint may be generated, feature(s) may be extracted, or other analysis performed, as described at 206. At 264, the fingerprint (or another analysis result) is checked against a database of known non-compliant (or known compliant) content objects. In some embodiments, the database includes a common pool of content that has previously been identified either manually or automatically as non-compliant or compliant. The content can be looked up by fingerprint or any other appropriate index. At 266, it is determined whether the content object is non-compliant according to the database. If it is non-compliant according to the database, the content object is removed at 272. If it is not non-compliant according to the database, then the content object is evaluated at 268. (In some embodiments, if the content is compliant according to the database, then the content object is approved for posting.) Evaluating may include any of the processes described at 212-213 and/or at 240-256. In some embodiments, evaluating includes notifying the content host (e.g., the OSP) and receiving an evaluation of the content object from the content host. For example, the content host may perform a manual or automatic evaluation. The results of or data from the evaluation is stored in the database with the fingerprint. At 270, it is determined whether the content object is non-compliant according to the evaluation. For example, the determination can be made based on technical factors, compliance factors, or other factors, as previously described. If the content object is non-compliant, the content object is removed at 272. If not, the process ends. In some embodiments, if the content object is not non-compliant, then the content object is approved for posting.
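The flow from 260 to 272 can be sketched as a single function. This sketch makes several assumptions: an exact SHA-256 hash stands in for the fingerprint (a real fingerprint would tolerate small changes), the database is a plain dictionary, and `evaluate` is a caller-supplied function returning True when the object is non-compliant. All names are hypothetical.

```python
import hashlib

def check_posted_object(content, database, evaluate):
    """Return "removed" or "approved" for a newly received content object.

    content:  the object as bytes.
    database: dict mapping fingerprint -> "compliant" or "non-compliant".
    evaluate: function(content) -> True if the object is non-compliant.
    """
    fp = hashlib.sha256(content).hexdigest()   # 262: exact-hash stand-in
    status = database.get(fp)                  # 264: look up known results
    if status == "non-compliant":              # 266
        return "removed"                       # 272
    if status == "compliant":
        return "approved"
    non_compliant = evaluate(content)          # 268: full evaluation
    # Store the result with the fingerprint so the object is not re-evaluated.
    database[fp] = "non-compliant" if non_compliant else "compliant"
    return "removed" if non_compliant else "approved"  # 270, 272
```

Once an object has been evaluated, a second upload of identical bytes is decided from the database alone and the evaluation function is never called.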
is a flow chart illustrating an embodiment of a process for engaging with a user of non-compliant content. In this example, rather than automatically removing a non-compliant content object, the user may be contacted first. In some embodiments, this process is performed by content monitoring system 100. In some embodiments, this process is used to perform 236 when non-compliant content is found. For example, this process may be performed in place of 272. At 280, a content object is determined to be non-compliant. For example, the determination can be made based on technical factors, compliance factors, or other factors, as previously described. At 282, it is determined whether user contact is requested, which may be a configurable setting. In this example, the user refers to the entity that posted the content on the OSP. If user contact is not requested, then the content object is removed. If user contact is requested, then the user is contacted at 284. For example, the user is notified that the user's content has been identified as non-compliant content and is asked to either take down the content, explain why the content is compliant, or cause the content to be compliant (e.g., based on usage rules for the content). At 286, it is determined whether the content object is in compliance. For example, the user is given a set amount of time to respond, and after that time, an evaluation of whether the content object is in compliance is performed. If it is still not in compliance, the content object is removed. In some embodiments, if it is still not in compliance, the user is notified again, or another appropriate action is taken. If the content object is in compliance, the process ends. In some embodiments, if the content object is now in compliance, a database is updated to include this information.
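As a non-limiting sketch, the decision at 282-286 could be expressed as shown below. The callables `notify_user` and `recheck` are hypothetical placeholders for whatever notification and re-evaluation mechanisms a given embodiment provides; the waiting period is assumed to have elapsed before `recheck` is called.

```python
from typing import Callable


def handle_non_compliant(contact_requested: bool,
                         notify_user: Callable[[], None],
                         recheck: Callable[[], bool]) -> str:
    """Return "removed" or "kept" for a content object found non-compliant at 280."""
    if not contact_requested:      # 282: user contact is a configurable setting
        return "removed"
    notify_user()                  # 284: ask the user to take down, explain, or fix
    # 286: after the response window, evaluate compliance again
    return "kept" if recheck() else "removed"
```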
is a flow chart illustrating an embodiment of a process for displaying compliance information (e.g., rules) to a content user wishing to use content on a content owner's website (as described at 203). In this example, a content owner has created a web page of his content (e.g., “www.example.com”) and included on the web page a link that is associated with a server that stores compliance information associated with his content. In some embodiments, the link is a common URL, where the common URL is not unique to the content owner or his web page (e.g., “rules.attributor.com”). At 290, the web page is viewed, e.g., by a potential content user. At 292, the “rules.attributor.com” link is selected. For example, the content user is interested in using the content, and would like to know if there are any usage rules associated with it.
A receiving system (e.g., a server that stores or has access to the compliance information) receives the request for “rules.attributor.com” at 296 and determines the appropriate compliance information at 298. In some embodiments, the compliance information is determined by looking up the web page from which the link was selected (e.g., the content owner's web page) in a table (or other appropriate structure) of compliance information. For example, next to “www.example.com” in the table are usage rules associated with content on “www.example.com.” In some embodiments, the table includes information about content objects on the web page and associated usage rules. In some embodiments, the server retrieves the content on web page “www.example.com” and looks up associated compliance information based on the retrieved content information. For example, each content object may have a content object ID or fingerprint that may be used to identify it and look up usage rules associated with it. In some embodiments, both the URL “www.example.com” and information associated with the content object (such as a content object ID) are used to obtain the compliance information.
At 299, a web page with the compliance information is returned. At 294, the web page with the compliance information is viewed. For example, the potential content user views the compliance information and can decide whether to use the content.
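One possible sketch of the lookup at 296-298 is given below. It assumes, for illustration, that the receiving system learns the originating web page from a referring URL supplied with the request and that the table is keyed by host name; the table contents and the function name are hypothetical.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse

# Hypothetical table mapping a content owner's site to its usage rules.
RULES_BY_PAGE: Dict[str, List[str]] = {
    "www.example.com": ["attribution required", "no commercial use"],
}


def rules_for_referrer(referrer: Optional[str]) -> List[str]:
    """Look up compliance information for the page the common link was selected from."""
    if not referrer:
        return []
    # Accept both full URLs and bare host names.
    parsed = urlparse(referrer if "://" in referrer else "//" + referrer)
    host = parsed.netloc.lower()
    return RULES_BY_PAGE.get(host, [])
```

An embodiment that also uses a content object ID or fingerprint could extend the key to a (host, object ID) pair.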
is an example of a graphical user interface (GUI) for providing controlled content. In some embodiments, a user uses GUI 300 to specify content to be monitored at 202. As shown, a user can enter a URL or a link to controlled content or upload a file. Any number of content objects can be specified. A username and password to access content can be provided. In some embodiments, a user uses GUI 300 to specify input to ingestor 104 in FIG. 1.
GUI 300 and the other GUIs described herein may vary depending on the embodiment. Which functionality to include and how to present the functionality may vary. For example, which objects (e.g., text, links, input boxes, buttons, etc.) to include and where to place the objects may vary depending on the implementation.
is an example of a GUI for providing controlled content. In some embodiments, GUI 400 opens in response to selecting a link in GUI 300, such as the “Add Content” button. In some embodiments, a user uses GUI 400 to specify content to be monitored at 202. In some embodiments, a user uses GUI 400 to specify input to ingestor 104 in FIG. 1.
As shown, one or more files may be provided in the “My content” input box. A user can indicate whether the content is a single web page or file or a URL or feed. In the case of the URL or feed, the content includes all existing content plus any new content added in the future. In the “Nickname” input box, the user can specify a nickname for the controlled content. In this way, a user can manage or maintain multiple sets of controlled content using different nicknames.
In some embodiments, a “Sites to watch” input box is provided, in which the user may enter URLs where the user expects the content to appear. For example, the user may currently be aware that a particular site is using the user's content. In some embodiments, the content monitoring system searches the web, but searches the specified sites first or more frequently.
In some embodiments, a “Related Keywords” input box is shown, in which the user may enter keywords associated with the specified controlled content. For example, if the user expects the content to be found primarily in children's websites, the keywords “kids” and “children” might be included. In some embodiments, the content monitoring system automatically determines keywords (such as unique phrases) to search in addition to the related keywords specified by the user.
In some embodiments, a “Search Scope” input box is shown, in which the user may specify whether the entire Internet should be searched or only domains specified by the user. In some embodiments, the user may specify to only search sites that copy the most often.
In some embodiments, a “Text” input box is provided, in which text may be entered. The text may be text in the content itself or text associated with the content, such as keywords, tags, depictions of the text (e.g., a photo of a street sign with text), etc. In addition, other search criteria may be specified, including a minimum similarity score, a minimum non-compliance score, a minimum percent of controlled content copied, a minimum percent of text copied, a minimum number of images copied, a minimum percent of match, whether the content is attributed (e.g., to the content owner), whether there is advertising on the page and what type, the minimum number of unique visitors per month, and what types of matches to find (e.g., images only, text only, video only, or combinations, etc.).
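For purposes of explanation, a few of the search criteria listed above might be applied to candidate matches as in the following sketch. The `Match` and `Criteria` types, their field names, and the choice of which criteria to model are assumptions of this sketch only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Match:
    url: str
    similarity: int        # similarity score
    non_compliance: int    # non-compliance score
    attributed: bool       # whether the match attributes the content owner


@dataclass
class Criteria:
    min_similarity: int = 0
    min_non_compliance: int = 0
    only_unattributed: bool = False


def filter_matches(matches: List[Match], c: Criteria) -> List[Match]:
    """Keep only the matches that satisfy every specified criterion."""
    return [
        m for m in matches
        if m.similarity >= c.min_similarity
        and m.non_compliance >= c.min_non_compliance
        and not (c.only_unattributed and m.attributed)
    ]
```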
is an example of a GUI for providing usage rules. In some embodiments, GUI 402 is included as part of GUI 400. In some embodiments, GUI 402 opens in response to selecting a link in GUI 400, such as a “Specify Rules of Use” link (not shown in GUI 400). In some embodiments, a user uses GUI 402 to specify usage rules associated with the content specified in GUI 400. In some embodiments, a user uses GUI 402 to specify usage rules at 203.
As shown, a list of usage rules may be selected by selecting bullets and checkboxes. The rules listed in this example include: attribution required/not required; commercial use OK, OK if user shares a specified percentage of the revenue, or no commercial use; limit text copies to a specified percentage of the source (controlled) content; no changes may be made to controlled content; contact content owner first for permission; share alike; a specified Creative Commons license; all rights reserved; or public domain.
Graphical icons are displayed next to each usage rule. For example, “$%” indicates that commercial use is okay if the user shares a specified percentage of the revenue. “By” with a slash through it indicates that attribution is not required. “%” indicates that text copied must be limited to a specified percentage of the controlled content.
A similar GUI may be used to specify host rules for a host policy.
is an example of a GUI for displaying search results. In some embodiments, GUI 500 is used to report search results at 214, e.g., to a content owner. In some embodiments, reporter 110 in FIG. 1 reports results using GUI 500.
In the example shown, a content owner owns a collection of photography related content, including images of cameras and text descriptions of cameras. The search results are shown in a grid based layout. In each grid cell, a controlled content object and a match content object are shown, where it has been determined that the match content object is similar to the controlled content object based on a similarity score and a non-compliance score. As shown, in grid cell 502, the controlled image (camera1) and the match image (camera2) have a similarity score of 98 and a non-compliance score of 88. In some embodiments, data displayed includes one or more of the following: similarity score, non-compliance score, URL of the match content object, percent of the controlled object copied, percent of the controlled text copied, the number of controlled images copied, the date found, whether there is advertising on the page, etc. In the case of text, a portion of the copied text is displayed along with the controlled text in grid cell 504.
In some embodiments, rather than or in addition to reporting a score, a binary flagging (e.g., “interesting” or not) is reported. For example, a score that aggregates similarity, non-compliance, and/or other factors into a combined/summary score may be displayed.
In some embodiments, if there is more than one matched content object, then the additional matched content objects are displayed using a 3D graphical effect indicating there are multiple pairs. Using forward and back arrows, the user can cycle through the various pairs. In some embodiments, the pairs are displayed in descending similarity score order.
Various other functionality may be provided in GUI 500. For example, the search results may be filtered and sorted in various ways using the “Showing” and “Sort by” pull down menus. Additional controlled content may be added in the “Controlled Content” input box, an email address may be entered for automatic notification (e.g., when more matches are found) in the “Email Address” input box, etc. Rather than use a grid based layout, other layouts may be used in other embodiments.
In the case of a content host monitoring for use of non-compliant content based on the host policy, an interface similar to interface 500 may be used to display resulting matches. For example, cell 502 may display a match with copyrighted content.
Cell504 may display a match with content associated with child pornography. For example, in place of text1 may be a known image that has been positively identified (either manually or automatically) as child pornography, and in place of text2 may be a new image that is being posted by a user to the content host. In this case, the known image in place of text1 may have been in a database of known non-compliant content, and the match determined as described at 264. In some cases, the new image is determined to be a match with child pornography based on an evaluation (e.g., 268) rather than a match with a content object in a database of known pornography. In this case, in place of text1, there may be no image displayed, or data related to the evaluation may be displayed instead.
is an example of a GUI for displaying use of a content object. In some embodiments, GUI 600 is displayed in response to selecting a “Match” link or the image or text corresponding to a match object in GUI 500.
In the example shown, the portions of the web page that include use of the controlled content are marked, i.e., boxed (e.g., a graphical box around the image or text that is being used). In this example, text1, text3, and photo2 are controlled content objects that are being used on this web page. In various embodiments, various indicators (e.g., visual cues) may be used to indicate the copied portions. Examples of indicators include: highlighting text, changing font appearance (e.g., using bold, underline, different fonts or font sizes, etc.), using different colors, displaying icons or other graphics in the vicinity of the copied portions, using time dependent indicators, such as causing the copied portions to flash, etc.
Various options or functionality may be provided for displaying information related to the use of the controlled content. For example, an archive date (May 31, 2006) may be displayed. Applicable usage rule(s) specified by the content owner may be displayed. In this case, the usage rules are displayed using the icons described with respect to FIG. 4B. When selecting an icon, details regarding the associated usage rule may be displayed.
In some embodiments, the web page shown is the current version of the web page. In some embodiments, the web page shown is an archived version. For example, the archived version may be stored in monitored content store 118. Whether the web page is the current version or an archived version may be indicated in the GUI. In addition, the user may be able to toggle between the two versions.
In some embodiments, a management GUI may be provided for managing content that provides links to one or more of the GUIs described above. In some embodiments, a user uses the management GUI to manage content, including adding new controlled content, modifying search parameters, reporting search results, etc. For example, various tabs may be provided, such as a “My Content” tab used to add/modify controlled content and search parameters and a “Matches” tab used to display search results. In some embodiments, selecting the “Matches” tab opens GUI 500.
A user can group content into categories, such as categories associated with the user's blog, camera reviews, the user's eBay account, and all files. In various embodiments, content may be grouped in folders, by tags, or in any other appropriate way. A list of controlled content (e.g., URLs, paths) associated with the category may be displayed, including the number of content objects associated with the controlled content, when the content objects were last archived (e.g., placed in controlled content store 116), rules associated with each content object, and the number of matches found for the content object(s).
Determination of Originality
is a block diagram illustrating an embodiment of a system for making a determination of originality of content. System 700 provides a determination of originality for one or more content objects. In some embodiments, system 700 provides a determination of originality for all registered content objects. In some embodiments, system 700 provides a determination of originality for a content object in response to a request. For example, system 700 may be used in a content clearinghouse system. A determination of originality may be useful in a variety of applications, including the verification of originality for the purposes of determining the priority of listings in search and match results in search engines, licensing of the content, and participation of publishers in contextual advertising networks, as more fully described below.
As used herein, an original content object includes an instance of a content object that is available at a location (such as a URL) that is served by or on behalf of either the original author or creator of the content or by a party authorized to present the content. As such, there may be more than one “original” of any unique content object. In many but not all cases the “original” may appear at the location where the unique content object was first observed by any automated crawler. A derivative version of a content object may be non-identical to an original version of a content object.
In the examples described herein, for purposes of explanation, it may be assumed that there is only one original content object or one Deemed Original, as more fully described below. However, it should be understood that in many cases, there are multiple original content objects or Deemed Originals.
In the example shown, content object 704, originality factors 712, and matching content (if any) are provided as input to originality analysis block 702. In some embodiments, content object 704 is one of a plurality of content objects that is provided as input to originality analysis block 702. For example, a crawler, such as crawler 112 (FIG. 1), may crawl the Internet to catalog which content is original (and which is not). In other embodiments, users designate content sources for capture and comparison.
In some embodiments, the matching content is provided by a matching engine 706, such as matching engine 120 (FIG. 1). The matching content is content that matches content object 704 based on criteria such as similarity scores and non-compliance scores, as described above. Depending on content object 704, there may or may not be matching content.
Originality analysis block 702 analyzes content object 704, originality factors 712, and matching content. Originality factors 712 include any factors that may affect the originality of the content object. Examples of originality factors 712 include: whether the originality of the content object has been challenged by another user, whether the claimed owner of the content object is registered or authenticated, and any third party verification of an originality factor, such as where a third party content host presents the content with an indication that the user has a paid subscription to the hosting service, which may indicate that the user is not anonymous and therefore more likely to be the claimed rights holder. Besides originality factors 712, other originality factors may be analyzed. For example, other originality factors may be derived from content object 704 (or the matching content), such as whether content object 704 is a subset or superset of another content object and the presence or absence of attribution. Originality factors are described more fully below.
In some embodiments, originality analysis block 702 determines an originality score for content object 704 and each matching content object. Originality determiner 708 makes a determination of which of the content objects, if any, are original content objects. An originality determination 710 is provided as output. In some embodiments, if the content object is determined to be original, the content object is identified as a “Deemed Original”. In some embodiments, none of the content objects are deemed original.
Originality determination 710 may be made based on a variety of rules and does not necessarily result in the actual original content object. For example, in some cases, originality determiner 708 selects the content objects that are the most likely to be original, e.g., based on one or more rules.
In some embodiments, “Deemed Original” status is published in a Usage Rules Summary associated with each registered content page, as a web service, available to third parties such as search engines, in a visible badge that may be coded by users into their registered content pages and feeds, on match results pages when the match query includes the registered content, and/or as part of publisher reputation information that is provided to hosts with remedy requests.
Content owners may benefit from Deemed Original status because “Deemed Original” status provides an originality verification for licensing and revenue sharing transactions involving the registered content. Presentation on match results pages allows potential content licensees to find rights holders more easily. Originality scores can be used for ranking search engine results, filtering search spam and claiming rights to contextual ad revenue, as described more fully below. Individual authors and creators may take pride in the distinction of having content that is a Deemed Original.
In some embodiments, for non-paying users, Deemed Original status for any content object is noted in appropriate parts of their interface. A user must subscribe (and become a paying user) to have their Deemed Original Status publicly available.
In some embodiments, upon initial designation of a content source for monitoring, the user is provided with a link to an inventory of all content objects that are Deemed Originals. This may provide an opportunity to communicate immediate benefits of subscription at the time of registration, even prior to identification of any actionable matches.
In some embodiments, a Source Detail Page includes visual cues (such as highlighting) to indicate Deemed Original content objects. This view also is publicly available from a Usage Rules Summary for the related page.
is a flowchart illustrating an embodiment of a process for performing an originality determination. This process may be implemented on system 700, for example. In the example shown, at 802, a content object is received. For example, a request for a determination of originality of a content object is received. The request may be made by a content owner, a content host, a content user, or a crawler. For example, the request may be made by a user who is interested in using the content object and would like to know who owns the content object. In another example, a content host may request a determination of originality so that it can provide a visual indication of originality when displaying a content object. For example, search engine results may display search results associated with original content objects above other search results.
At 804, an originality determination is made. Determining originality includes analyzing one or more originality factors related to the content object and determining automatically whether the content object is original based on the analysis of the originality factors. Examples of originality factors are more fully described below. At 808, the determination is output. In some embodiments, the determination is displayed in a user interface. The user interface may display a visual indication of whether the content object is a deemed original. In some embodiments, the originality score is displayed or accessible to the user. In some embodiments, the originality score is hidden from the user.
In some embodiments, the originality determination for a content object has already been made and is stored (e.g., as metadata for the content object, as described below). In this case, the determination is retrieved at 804.
is a flowchart illustrating an embodiment of a process for making an originality determination. This process may be implemented by system 700 and may be used to perform 804. In the example shown, at 904, it is determined whether there are any matching content objects. If it is determined that there are no matching content objects, then an originality score is computed for the content object at 902. At 906, the content object is designated as a deemed original based on the originality score for the content object. For example, a threshold may be specified such that if the originality score for the content object is above the threshold, the content object is designated as a deemed original.
If it is determined that there are matching content objects, then at 908, an originality score is computed for the content object and the matching content objects. At 910, the content object corresponding to the highest score (of the original and the matching content objects) is designated as a deemed original. In the case in which two or more content objects have matching or similar originality scores, in some embodiments, the content object corresponding to the earliest time of first appearance is selected. In some embodiments, rather than or in addition to outputting whether the content object is a deemed original, the originality score is outputted. As previously described, there may be more than one instance of an original content object. In some embodiments, if two or more content objects have matching or similar originality scores and/or are above a threshold, then these content objects are all Deemed Originals.
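The designation logic at 904-910 can be sketched, under stated assumptions, as follows. The function name, the dictionary inputs, the numeric threshold, and the treatment of only exactly equal scores as ties are choices made for this sketch; it returns a single Deemed Original, whereas some embodiments may designate several.

```python
from typing import Dict, Optional


def pick_deemed_original(scores: Dict[str, float],
                         first_seen: Dict[str, int],
                         threshold: float = 0.5) -> Optional[str]:
    """scores maps an object ID to its originality score, covering the
    subject content object plus any matching content objects."""
    if not scores:
        return None
    if len(scores) == 1:                       # 904: no matching content objects
        obj_id, score = next(iter(scores.items()))
        return obj_id if score > threshold else None   # 906: threshold test
    # 910: highest score wins; equal scores fall back to the
    # earliest time of first appearance.
    return min(scores, key=lambda o: (-scores[o], first_seen[o]))
```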
In some embodiments, the originality score is stored, for example, as metadata for the content object. In some embodiments, the originality score for all the content objects is stored, which may be desirable so that calculating the scores does not need to be repeated at a future time. In some embodiments, an indication of whether a content object is deemed original is stored. In some embodiments, the content objects that are not deemed original are deemed unoriginal, and this is stored. For example, a content object may be deemed original, deemed unoriginal, or not yet tested for originality.
is a flowchart illustrating an embodiment of a process for computing an originality score for a content object. In some embodiments, this process is used to perform 902 and/or 908. This process may be implemented on system 700. In the example shown, the process begins at 1002, at which it is determined whether the content object is registered. A content object may be registered by the owner of the content object. In this example, a content object must be a deemed original in order to be registered. Thus, if it is determined that the content object is registered at 1002, then at 1004, an indication of registration is outputted.
Otherwise, if it is determined that the content object is not registered, then at 1006, originality factors related to a host and/or claimed owner associated with the content object are analyzed.
If the content object is hosted, then originality factors related to the host may be analyzed. For example, if a user browsing the web comes across a content object hosted by a content host, the user may request that an originality determination be made of the content object. Originality factors related to the content host may then be analyzed, including, for example, whether the content host is registered or is legally bound, as more fully described below.
If the content object has a claimed owner, then the claimed owner may be analyzed. For example, a content owner may upload a content object and request to register the content object. Before the content object can be registered to the content owner, an originality determination is made of the content object. Originality factors related to the content owner may then be analyzed, including, for example, whether the content owner is registered or is legally bound, as more fully described below.
The content object may both have a claimed owner and be hosted. For example, a professional photographer's photo may be displayed on a news website at URL X. The photographer may then request to register the photo located at X. In response, an originality determination is made of the photo, including analyzing both the content owner (i.e., the photographer) and the content host (i.e., the news website). If, for example, the news website is registered and legally bound, then the content object would have a higher originality score.
Examples of analyzing originality factors related to a host and/or content owner are described below with respect to FIG. 11.
At 1008, other originality factors are analyzed. Other originality factors include, for example, historical information associated with the content object, attribution associated with the content object, and the quality of the content object. Examples of analyzing other originality factors are described below with respect to FIG. 12.
At 1010, an originality score is computed based on the analysis at 1006 and/or 1008. For example, if the claimed owner is registered, then the originality score is higher. If there is an attribution to the content object, then the originality score is higher. Further examples are provided below. At 1012, the score is outputted.
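As one hedged illustration of 1002-1012, an additive score over a handful of boolean factors is sketched below. The factor names, the `Result` type, and in particular the numeric weights are hypothetical; the described embodiments state only that certain factors make the score higher, not how the factors are combined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Factors:
    object_registered: bool = False   # checked at 1002
    owner_registered: bool = False    # host/owner factor analyzed at 1006
    host_registered: bool = False     # host/owner factor analyzed at 1006
    has_attribution: bool = False     # other factor analyzed at 1008


# Hypothetical weights chosen only to make the example concrete.
WEIGHTS = {"owner_registered": 30, "host_registered": 20, "has_attribution": 25}


@dataclass
class Result:
    registered: bool
    score: Optional[int]


def originality_score(f: Factors) -> Result:
    if f.object_registered:                        # 1002 -> 1004
        return Result(registered=True, score=None)
    # 1010: each factor that is present raises the score by its weight.
    score = sum(w for name, w in WEIGHTS.items() if getattr(f, name))
    return Result(registered=False, score=score)   # 1012
```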
is a flowchart illustrating an embodiment of a process for analyzing originality factors related to the host and/or claimed owner of a content object. For example, this process may be used to perform 1006. This process may be implemented by system 700.
In the example shown, the process begins at 1101, at which the reputation of the host or claimed owner is checked. In some embodiments, reputation is based on past behavior of a host or owner. For example, a host that always attributes content has a higher reputation than a host that does not attribute content. A content owner whose ownership is frequently challenged has a lower reputation than a content owner whose ownership has never been challenged. Reputation may be with respect to one or multiple systems. In some embodiments, a higher reputation corresponds to a higher originality score. Reputation is more fully described below with respect to FIG. 13.
At 1102, it is determined whether the host or claimed owner is registered. In some embodiments, content hosts are registered as hosts and content owners are registered as owners. In some embodiments, hosts and owners are registered as the same entity. In some embodiments, registration comprises providing identification (e.g., name, email address, residence address, social security number, credit card number, etc.). In some embodiments, the more identification provided, the higher the originality score.
In this example, the host or owner must be registered in order to be authenticated and must be authenticated in order to be legally bound. (Other embodiments may vary.) Therefore, if the host or owner is not registered, then the result that the host or owner is not registered is output at 1108. If the host or owner is registered, then it is determined whether the host or owner is authenticated at 1104. In some embodiments, authentication comprises verification of identification (e.g., verifying that a credit card number is valid). In some embodiments, the user is a paying subscriber, whose identity may be authenticated through a credit card transaction, identity verification service, direct enterprise contract, or other identity verification method.
If the host or owner is not authenticated, then the results (from 1102 and 1104) are output at 1108. If the host or owner is authenticated, then it is determined whether the host or owner is legally bound at 1106. In some embodiments, a host or owner is legally bound if the host or owner has agreed to comply with one or more legal requirements. For example, the user has agreed to specified obligations or penalties for ownership claims made in bad faith. The results (from 1102, 1104, and 1106) are output at 1108.
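The dependency among the checks at 1102, 1104, and 1106 may be easier to follow in code. In the sketch below, which is illustrative only, the function name and the dictionary output format are assumptions; each later check is reached only if the earlier one succeeds, and whatever results have been gathered are output at 1108.

```python
from typing import Dict


def check_standing(registered: bool,
                   authenticated: bool,
                   legally_bound: bool) -> Dict[str, bool]:
    """Collect the results that would be output at 1108."""
    result = {"registered": registered}          # 1102
    if not registered:
        return result                            # not registered: stop here
    result["authenticated"] = authenticated      # 1104
    if not authenticated:
        return result                            # not authenticated: stop here
    result["legally_bound"] = legally_bound      # 1106
    return result
```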
There may be inconsistent indication of originality of a content object, such as conflicting claims of ownership. For example, a content owner attempts to register a content object and during the originality determination process, it is found that another content owner has already registered the content object. In this case, the ownership of the content object may be challenged. In some embodiments, if there is a challenge to the claim for originality, then this may lower the originality score or be recorded separately (e.g., rather than designating a content object a Deemed Original, it may be deemed a Qualified Original).
is a flowchart illustrating an embodiment of a process for analyzing originality factors. For example, this process may be used to perform 1008. This process may be implemented by system 700.
In the example shown, the process begins at 1202, at which the extent of duplication is analyzed. For example, if it is determined that there are matching content objects, then the number of matching content objects is used to compute the originality score for the content object. In some embodiments, the greater the number of matching content objects, the lower the originality score. In some embodiments, the number of matching content objects is used in combination with other factors, such as the number of matches that provide attribution versus the number of matches overall; the number of matches that pre-date versus post-date the first known instance of the subject content; and the number of pre-dating matches that contain attribution from other third parties.
At 1204, attribution associated with the content object is analyzed. Attribution may be direct or indirect attribution. An example of direct attribution is cnn.com attributing a news article to a media outlet, for example, by placing the name of the media outlet and/or a link to the media outlet's web location near the article. An example of indirect attribution is a blog attributing the news report to cnn.com. In some embodiments, the more direct or indirect attribution to a particular content object, the higher the originality score of the content object. Attribution may be determined using information extraction, natural language processing, and/or analysis of links (e.g., URLs).
At 1208, similarity to matches is analyzed, where similarity includes whether the content object is a subset of a match, a superset of a match, or identical to a match. For example, a content object may comprise one paragraph from a news article (subset) or a content object may comprise the news article plus additional content (superset). Percentages or other measures may be computed to quantify the similarity. In some embodiments, if the content object is a subset of a match, then the originality score is lower. If the content object is a superset of a match, then the originality score is higher.
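The subset, superset, and identical relationships at 1208 can be sketched as follows. This is an illustrative simplification, not the described system's method: it assumes each content object has already been reduced to a set of comparable units (e.g., paragraph identifiers), and the function name `classify_similarity` and the "partial" and "none" labels are hypothetical.

```python
def classify_similarity(content: set, match: set) -> tuple:
    """Return (relation, coverage) for a content object versus one match.

    relation is one of "identical", "subset", "superset", "partial", "none".
    coverage is the fraction of the match's units found in the content
    object, one possible percentage-style measure of similarity.
    """
    if not content or not match:
        return ("none", 0.0)
    overlap = len(content & match)
    coverage = overlap / len(match)
    if content == match:
        relation = "identical"
    elif content < match:      # proper subset, e.g. one paragraph of an article
        relation = "subset"
    elif content > match:      # proper superset, e.g. the article plus extras
        relation = "superset"
    elif overlap:
        relation = "partial"
    else:
        relation = "none"
    return (relation, coverage)
```

For example, a content object holding one paragraph of a four-paragraph article would be classified as a subset with coverage 0.25, while the article plus additional content would be a superset with coverage 1.0.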
At 1210, a time associated with the content object is determined. For example, the time could be a time or time stamp when the content object first appeared on a content host. In some embodiments, the time is obtained using the Internet Archive. In some embodiments, the earlier the time, the higher the originality score. In various embodiments, a time associated with the content, matching content, or attributing content is determined.
At 1212, the results (from 1202-1210) are output. This flowchart is an example of various originality factors that may be analyzed. In various embodiments, various other originality factors may be analyzed. For example, the quality of the content object may be analyzed. An example of quality is the resolution of an image. In some embodiments, the higher the quality of a content object, the higher the originality score.
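One way the factors from 1202-1210 might be combined into a single originality score is sketched below. The source does not specify a formula; the weights, the caps, the 0-to-1 range, and the reduction of the time factor to a single "earliest known instance" flag are all assumptions made for illustration. Only the direction of each adjustment (more matches lower the score, more attribution raises it, and so on) follows the text.

```python
from dataclasses import dataclass


@dataclass
class Factors:
    """Hypothetical container for the outputs of steps 1202-1210."""
    num_matches: int        # 1202: number of matching content objects
    num_attributions: int   # 1204: direct or indirect attributions to the object
    relation: str           # 1208: "subset", "superset", "identical", or "none"
    earliest: bool          # 1210: True if this is the earliest known instance


def originality_score(f: Factors) -> float:
    """Combine the factors into a score in [0, 1] using assumed weights."""
    score = 0.5
    score -= 0.05 * min(f.num_matches, 5)        # more matches -> lower score
    score += 0.05 * min(f.num_attributions, 5)   # more attribution -> higher score
    if f.relation == "subset":
        score -= 0.1                             # subset of a match -> lower
    elif f.relation == "superset":
        score += 0.1                             # superset of a match -> higher
    if f.earliest:
        score += 0.15                            # earlier time -> higher
    return max(0.0, min(1.0, score))
```

Other factors mentioned in the text, such as content quality or host reputation, could be added as further terms in the same manner.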
is a block diagram illustrating an example of originality factors related to the reputation of a host. For example, these factors may be analyzed at 1204. In the example shown, web location 1300 is hosted by a content host. A web location may include a web page, feed, or any other method of transmission.
Web location 1300 includes content objects 1−N. n1 is the number of attributions to content object 1. Attributions may be in the form of text or links. n2 is the number of attributions to content object N. n3 is the number of attributions to web location 1300. n4 is the number of attributions from web location 1300 to other web locations, where the attributions are used to attribute content on web location 1300. In some embodiments, the greater n1, n2, n3, and/or n4, the greater the reputation of the host. Stated another way, if there are many attributions to web location(s) of a host (n3) or to content objects on web location(s) of a host (n1, n2), then the host's reputation goes up. Similarly, if there are many attributions from web location(s) of the host to other sources (n4), then the host's reputation goes up. In other words, a host that tends to attribute content has a good reputation. n1, n2, and n3 may include direct and/or indirect attribution. In some embodiments, direct attribution is weighted differently or more heavily. In some embodiments, since a content owner may be associated with a host, the content owner's reputation is based on the host's reputation and vice versa.
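A host reputation measure based on these attribution counts might be sketched as a weighted sum. The function name, the parameter names, and the default weight of 2.0 for direct attribution are assumptions for illustration only; the text states just that larger counts raise reputation and that direct attribution may be weighted more heavily.

```python
def host_reputation(inbound_direct: int,
                    inbound_indirect: int,
                    outbound: int,
                    direct_weight: float = 2.0) -> float:
    """Compute an illustrative reputation value for a host.

    inbound_direct / inbound_indirect: attributions to the host's content
        objects and web locations (n1 + n2 + n3), split by attribution type.
    outbound: attributions the host gives to other web locations (n4).
    direct_weight: assumed multiplier applied to direct attributions.
    """
    return direct_weight * inbound_direct + inbound_indirect + outbound
```

With the assumed weight, a host with 3 direct inbound attributions, 2 indirect inbound attributions, and 4 outbound attributions would receive a value of 12.0.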
In some embodiments, reputation is based not just on the tendency to attribute content, but also the tendency to attribute content properly or consistently with instances of attribution to the same source provided by other properties. In other words, in some embodiments, improper attribution does not necessarily increase the positive weight of the reputation.
Another factor that may affect a host or owner's reputation is the number of times or frequency that ownership of content by the host or owner is challenged. In some embodiments, the fewer times the host or owner has been challenged before, the higher its reputation.
In some embodiments, a challenge to a claim of ownership of content that has been designated as a “Deemed Original” changes its designation to a lower level of authentication (such as “Qualified Original”).
In some embodiments, a web page or other collection of one or more content objects is given a designation to indicate that each of the content objects on the Web page or in the collection is either: (1) original or (2) non-original and properly attributed.
is a flowchart illustrating an example usage of a system for determining originality, such as system 700. In this example, a search engine displays search results based on the originality of the search results. For example, a search engine may use an API that returns originality determination results for content objects. In some embodiments, a search engine may perform the originality determination.
In the example shown, at 1402, content is searched. At 1404, for each resulting content object in the search results, an originality determination is obtained. For example, the process of FIG. 8 is performed for each content object. At 1406, the search results are displayed based on results of the originality determination. For example, the search results may be displayed in an order that takes into account the originality of content in the search results. Search results with original content would be displayed higher than search results with less original content (e.g., search spam). For example, search results could be sorted and displayed based at least in part on originality scores associated with content in the search results. Alternatively, search results with original or unoriginal content could be flagged or some other visual cue used. In some embodiments, matches of Deemed Originals or content objects with higher originality scores are treated as higher priority in sorting of match results (since they are potentially of more interest to searchers), and/or may be provided as a more advanced search sorting filter. In some embodiments, content is included or excluded from search results based at least in part on the originality determination.
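The display step at 1406 can be sketched as a sort, with optional exclusion, over results that already carry an originality score. The dictionary keys `url` and `originality`, the function name `rank_results`, and the `min_score` threshold are hypothetical; the sketch assumes the originality determination at 1404 has already been obtained for each result.

```python
from typing import Optional


def rank_results(results: list, min_score: Optional[float] = None) -> list:
    """Order search results by originality score, highest first.

    results: list of dicts, each with an "originality" score (assumed key).
    min_score: if given, results scoring below it are excluded, as in
        embodiments that include or exclude content based on originality.
    """
    kept = [r for r in results
            if min_score is None or r["originality"] >= min_score]
    return sorted(kept, key=lambda r: r["originality"], reverse=True)
```

A search engine could alternatively keep its usual relevance ordering and use the score only to flag results, which would replace the sort with a per-result annotation.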
In some embodiments, content is presented with an indication of originality, which indication reflects a query, which may be real-time, to a third-party system. In some embodiments, this originality determination accounts for the presence of non-original content which does not lower the originality score so long as the non-original content is properly attributed, for example to a third party source(s).
Originality determination information may be used in a plurality of other applications. In some embodiments, an advertising network uses the determination to ensure that advertising revenue or other benefit is paid only to authorized persons, or that advertising revenue or other benefit is paid in proper amounts to one or more authorized persons. For example, an advertising system (e.g., Google AdSense) may use originality determination information for screening purposes. In some embodiments, the advertising system does not provide advertising revenue to websites that do not provide original content. This prevents search spam sites from receiving advertising revenue.
Challenges to Ownership
In some cases, third party users may challenge a registered user's claim of ownership to a content object and their Deemed Original status. In some embodiments, any challenger must have registered the related content and provided identity verification. The system does not need to make any judgment in such a situation, but may capture and publish information from each party along with other relevant information (such as time of first appearance).
A challenger may be required to show good faith by agreeing to specified dispute resolution rules. These rules could include legal fee shifting (loser pays attorneys' fees), liquidated damages (e.g., a fixed minimum amount), and/or arbitration (e.g., rapid, non-appealable resolution of the dispute). In the event that the challenged user declines any reciprocal agreement, “Deemed Original” status may be revoked. In some embodiments, whether and how the challenging user can thereafter acquire “Deemed Original” status is configurable.
In some embodiments, the “Deemed Original” process uses historical search results from an archive, such as the Internet Archive. The Internet Archive could be queried and results cached at the time of registered content capture. Where available, challenge participants can be encouraged to provide Archive references in their dispute statements.
In some embodiments, a separate terms-of-service disclosure and acceptance process is required to activate the publication of Deemed Original status. For example, the disclosure may highlight the dispute resolution terms, and any uncertainties associated with issues like enforceability and conflict of laws.
In some embodiments, the terms of service provide that multiple challenges to a user's Deemed Original status may result in account termination. Such conditions may trigger human review by a staff member.
Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.
Claims (31)
1. A system comprising:
an input to receive a content object posted to a web-based service, for distribution by the web-based service to the public;
a hardware processing unit configured for:
determining fingerprint data from a received content object, and by reference to the determined fingerprint data, determining that the received content object includes content that at least potentially matches controlled content;
obtaining usage rule data relating to the controlled content, the usage rule data having earlier been established by an owner of the controlled content, the usage rule data comprising: i) an owner-specified extent of copying between the received content object and the controlled content, and ii) an attribution requirement for received content;
determining an extent of copying between the received content object and the controlled content;
determining whether the received content object meets a minimum percentage of content; and
governing distribution of the content object based on the usage rule data including i) the owner-specified extent of copying between the received content object and the controlled content, and ii) the attribution requirement for received content, and on the determined extent of copying, when the received content object meets the minimum percentage of content; and
an interface through which different owners can specify different extent of copying requirements to govern distribution of their respective content.
2. The system of claim 1 in which said hardware processing unit is configured for determining an extent of copying that depends, in part, on a percentage of the controlled content that is included in the received content object.
3. The system of claim 1 in which said hardware processing unit is configured for governing distribution of content by:
for a first item of received content, preventing distribution of the first item of the received content from the web-based service; and
for a second item of received content, allowing distribution of the second item of received content from the web-based service.
4. The system of claim 1 further comprising a graphical user interface, said graphical user interface configured for presenting a representation of the received content object and a representation of the controlled content, and configured for presenting a copying extent score.
5. The system of claim 1 further comprising memory, and in which said hardware processing unit is configured for:
providing a user interface to upload the controlled content, or to specify a link to the controlled content;
receiving the controlled content, through said user interface; and
determining fingerprint data from the received controlled content and storing said determined fingerprint data in said memory.
6. The system of claim 1 further comprising a results user interface, the results user interface configured for presenting the received content object and options of confirming a match or rejecting a match.
7. The system of claim 1 in which said hardware processing unit is configured for determining fingerprint data from received content object by processing wavelet data.
8. The system of claim 1 in which said hardware processing unit comprises a server.
9. The system of claim 1 in which said hardware processing unit comprises a processor.
10. The system of claim 1 in which said input to receive the content object comprises a graphical user interface.
11. The system of claim 1 in which said interface through which different copyright owners can specify different extent of copying requirements to govern distribution of their respective content comprises a graphical user interface.
12. The system of claim 1 in which the minimum percentage of content comprises a percentage of content relative to the controlled content.
13. A system comprising:
means for receiving a content object posted to a web-based service, for re-distribution by the web-based service to the public;
means for determining fingerprint data from a received content object;
means for determining, by reference to determined fingerprint data, that the received content object includes content that at least partially matches controlled content;
a first graphical user interface through which owner specified inputs can be received by said system, said first graphical user interface comprising a first interface, a second interface, a third interface, and a fourth interface, in which the first interface is configured to receive geographical requirements, the second interface is configured to receive restrictions on an amount of allowed duplication between the received content object and the controlled content, the third interface is configured to receive use restrictions, and the fourth interface is configured to receive an attribution requirement;
means for determining an amount of duplication between the received content object and the controlled content;
means for governing re-distribution of the content object based on: i) a determined amount of duplication between the received content object and the controlled content, ii) owner specified geographical requirements, iii) owner specified restrictions on an amount of allowed duplication between the received content object and the controlled content, iv) owner specified use restrictions, and v) an owner specified attribution requirement; and
a second graphical interface comprising a match interface, the match interface configured for presenting graphics corresponding to the received content object, and configured for receiving input for confirming a match, the second graphical interface comprising a content location interface configured for providing access to the received content object.
14. The system of claim 13 further comprising an interface through which different owners can specify different amounts of allowed duplication requirements to govern distribution of their respective content.
15. The system of claim 13 in which said means for determining an amount of duplication between the received content object and the controlled content utilizes a percentage of the controlled content that is included in the received content object.
16. The system of claim 13 further comprising a results graphical user interface, said results graphical user interface configured for presenting a representation of the received content object and a representation of the controlled content, and configured for presenting the determined amount of duplication.
17. The system of claim 13 further comprising a memory, a user interface configured to receive the controlled content or a link to the controlled content, and an electronic processor configured for determining fingerprint data from received controlled content and storing said determined fingerprint data in said memory.
18. A system comprising:
an interface for receiving content posted to a web-based service, for distribution by the web-based service to the public;
a hardware processing unit configured for fingerprinting received content, and by reference to the computed fingerprint data, identifying copyrighted content included in the received content;
a first user interface configured for obtaining owner-specified extent of copying requirements, an owner-specified attribution requirement and owner-specified geographical requirements, all of which pertain to the identified copyrighted content;
a hardware processing unit configured for controlling distribution of the received content from the web-based service, in accordance with the owner-specified extent of copying requirements, owner-specified geographical requirements, and a determined extent of copying between the received content and the identified copyrighted content;
a second user interface configured for: i) with use of imagery, presenting for review one or more items of user-posted content hosted by said web-based service that are identified as a match with the identified copyrighted content; and ii) receiving, via the second user interface for use in confirming the match, information regarding the match; and
a third interface for distributing the received content from the web-based service along with an attribution associated with the identified copyrighted content.
19. The system of claim 18 further comprising a processor configured for determining an extent of copying between the received content and the identified copyrighted content.
20. The system of claim 18 in which the identified copyrighted content comprises a part of a copyrighted work.
21. The system of claim 18 in which the identified copyrighted content comprises a whole of a copyrighted work.
22. The system of claim 18 in which said hardware processing unit that is configured for fingerprinting received content comprises a server.
23. The system of claim 18 in which said hardware processing unit that is configured for controlling distribution of the received content comprises a processor.
24. The system of claim 18 in which said hardware processing unit that is configured for fingerprinting received content comprises a processor.
25. The system of claim 18 in which said hardware processing unit that is configured for controlling distribution of the received content comprises a server.
26. A system comprising:
means for receiving content posted by a user to a web-based service, for distribution by said web-based service to the public, the received content including copyrighted content for which copyright is held by a copyright owner;
means for computing fingerprint data from the received content;
means for detecting and identifying, by reference to computed fingerprint data, the copyrighted content included in the received content;
means for determining a copy extent score between the received content and the copyrighted content; and
a first user interface that enables the copyright owner to specify an attribution requirement and a copy extent score for the copyrighted content, in which different copyright owners can specify different copy extent scores for their respective content; and
means for governing distribution of the received content to the public in accordance with whether the determined copy extent score exceeds the specified copy extent score, and if the received content meets a minimum percentage of content; and
a second user interface for distributing the received content from the web-based service along with an attribution associated with the identified copyrighted content.
27. The system of claim 26 in which the received content comprises received video content, and in which said first user interface is configured for receiving owner-specified geographical requirements, and in which means for governing distribution of the received content to the public governs in accordance with whether the determined copy extent score exceeds said specified copy extent score and in accordance with owner-specified geographical requirements.
28. The system of claim 26 further comprising a third user interface configured for presenting the received content and the copyrighted content for comparison.
29. The system of claim 28 in which the third user interface is configured to present the determined copying extent score.
30. The system of claim 26 in which said means for determining a copy extent score between the received content and the copyrighted content determines an extent of copying based, in part, on a percentage of the copyrighted content that is included in the received content.
31. The system of claim 26 in which the minimum percentage of content comprises a percentage of content relative to the copyrighted content.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/541,422 US9436810B2 (en) | 2006-08-29 | 2014-11-14 | Determination of copied content, including attribution |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/512,067 US8738749B2 (en) | 2006-08-29 | 2006-08-29 | Content monitoring and host compliance evaluation |
US11/655,748 US8707459B2 (en) | 2007-01-19 | 2007-01-19 | Determination of originality of content |
US14/258,633 US9654447B2 (en) | 2006-08-29 | 2014-04-22 | Customized handling of copied content based on owner-specified similarity thresholds |
US14/271,297 US8935745B2 (en) | 2006-08-29 | 2014-05-06 | Determination of originality of content |
US14/541,422 US9436810B2 (en) | 2006-08-29 | 2014-11-14 | Determination of copied content, including attribution |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/271,297 Continuation US8935745B2 (en) | 2006-08-29 | 2014-05-06 | Determination of originality of content |
Publications (2)
Publication Number | Publication Date |
---|---|
US20150074833A1 (en) | 2015-03-12 |
US9436810B2 (en) | 2016-09-06 |
Family
ID=39636314
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/655,748 Active 2030-05-24 US8707459B2 (en) | 2006-08-29 | 2007-01-19 | Determination of originality of content |
US14/271,297 Active US8935745B2 (en) | 2006-08-29 | 2014-05-06 | Determination of originality of content |
US14/541,422 Active US9436810B2 (en) | 2006-08-29 | 2014-11-14 | Determination of copied content, including attribution |
Family Applications Before (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/655,748 Active 2030-05-24 US8707459B2 (en) | 2006-08-29 | 2007-01-19 | Determination of originality of content |
US14/271,297 Active US8935745B2 (en) | 2006-08-29 | 2014-05-06 | Determination of originality of content |
Country Status (2)
Country | Link |
---|---|
US (3) | US8707459B2 (en) |
WO (1) | WO2008088888A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160196342A1 (en) * | 2015-01-06 | 2016-07-07 | Inha-Industry Partnership | Plagiarism Document Detection System Based on Synonym Dictionary and Automatic Reference Citation Mark Attaching System |
US10735381B2 (en) | 2006-08-29 | 2020-08-04 | Attributor Corporation | Customized handling of copied content based on owner-specified similarity thresholds |
US12141882B2 (en) | 2019-11-19 | 2024-11-12 | Google Llc | Methods, systems, and media for rights management of embedded sound recordings using composition clustering |
Families Citing this family (78)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2006327157B2 (en) * | 2005-12-20 | 2013-03-07 | Arbitron Inc. | Methods and systems for conducting research operations |
US20070162761A1 (en) | 2005-12-23 | 2007-07-12 | Davis Bruce L | Methods and Systems to Help Detect Identity Fraud |
US8707459B2 (en) | 2007-01-19 | 2014-04-22 | Digimarc Corporation | Determination of originality of content |
US8738749B2 (en) * | 2006-08-29 | 2014-05-27 | Digimarc Corporation | Content monitoring and host compliance evaluation |
US8010511B2 (en) | 2006-08-29 | 2011-08-30 | Attributor Corporation | Content monitoring and compliance enforcement |
US8301658B2 (en) | 2006-11-03 | 2012-10-30 | Google Inc. | Site directed management of audio components of uploaded video files |
US9179200B2 (en) * | 2007-03-14 | 2015-11-03 | Digimarc Corporation | Method and system for determining content treatment |
US10242415B2 (en) | 2006-12-20 | 2019-03-26 | Digimarc Corporation | Method and system for determining content treatment |
EP2156386A4 (en) * | 2007-05-03 | 2012-05-02 | Google Inc | Monetization of digital content contributions |
US8094872B1 (en) * | 2007-05-09 | 2012-01-10 | Google Inc. | Three-dimensional wavelet based video fingerprinting |
US7912894B2 (en) * | 2007-05-15 | 2011-03-22 | Adams Phillip M | Computerized, copy-detection and discrimination apparatus and method |
US20080288509A1 (en) * | 2007-05-16 | 2008-11-20 | Google Inc. | Duplicate content search |
US8611422B1 (en) | 2007-06-19 | 2013-12-17 | Google Inc. | Endpoint based video fingerprinting |
KR20090000898A (en) * | 2007-06-28 | 2009-01-08 | 엘지전자 주식회사 | Method and apparatus for creation and operation of copyrighted user handwritten works |
US8490206B1 (en) * | 2007-09-28 | 2013-07-16 | Time Warner, Inc. | Apparatuses, methods and systems for reputation/content tracking and management |
US8392604B2 (en) * | 2007-10-09 | 2013-03-05 | Yahoo! Inc. | Peer to peer browser content caching |
US8874565B1 (en) * | 2007-12-31 | 2014-10-28 | Google Inc. | Detection of proxy pad sites |
US20090241165A1 (en) * | 2008-03-19 | 2009-09-24 | Verizon Business Network Service, Inc. | Compliance policy management systems and methods |
IES20080215A2 (en) * | 2008-03-20 | 2008-10-15 | New Bay Res Ltd | Access rights for digital objects |
US9189628B2 (en) * | 2008-04-10 | 2015-11-17 | Adobe Systems Incorporated | Data driven system for responding to security vulnerability |
US10049414B2 (en) * | 2008-05-01 | 2018-08-14 | Google Llc | Automated media rights detection |
US20100031364A1 (en) * | 2008-08-04 | 2010-02-04 | Creative Technology Ltd | Method for creating a verifiable media object, a corresponding system thereof, and a verification package for a media object |
IT1391936B1 (en) * | 2008-10-20 | 2012-02-02 | Facility Italia S R L | METHOD OF SEARCHING FOR MULTIMEDIA CONTENT IN THE INTERNET. |
US9058402B2 (en) * | 2012-05-29 | 2015-06-16 | Limelight Networks, Inc. | Chronological-progression access prioritization |
US8175617B2 (en) | 2009-10-28 | 2012-05-08 | Digimarc Corporation | Sensor-based mobile search, related methods and systems |
US8121618B2 (en) | 2009-10-28 | 2012-02-21 | Digimarc Corporation | Intuitive computing methods and systems |
US9886681B2 (en) * | 2009-11-24 | 2018-02-06 | International Business Machines Corporation | Creating an aggregate report of a presence of a user on a network |
US10339575B2 (en) * | 2010-03-05 | 2019-07-02 | International Business Machines Corporation | Method and system for provenance tracking in software ecosystems |
MX2013002671A (en) * | 2010-09-10 | 2013-07-29 | Atg Advanced Swiss Technology Group Ag | Method for finding and digitally evaluating illegal image material. |
US9552442B2 (en) | 2010-10-21 | 2017-01-24 | International Business Machines Corporation | Visual meme tracking for social media analysis |
US8798400B2 (en) | 2010-10-21 | 2014-08-05 | International Business Machines Corporation | Using near-duplicate video frames to analyze, classify, track, and visualize evolution and fitness of videos |
US20120173701A1 (en) * | 2010-12-30 | 2012-07-05 | Arbitron Inc. | Matching techniques for cross-platform monitoring and information |
GB2490490A (en) | 2011-04-28 | 2012-11-07 | Nds Ltd | Encoding natural-language text and detecting plagiarism |
WO2012163735A1 (en) * | 2011-05-31 | 2012-12-06 | Restorm Ag | System and method for online, interactive licensing and sales of art works |
US9015118B2 (en) | 2011-07-15 | 2015-04-21 | International Business Machines Corporation | Determining and presenting provenance and lineage for content in a content management system |
US9286334B2 (en) | 2011-07-15 | 2016-03-15 | International Business Machines Corporation | Versioning of metadata, including presentation of provenance and lineage for versioned metadata |
US9384193B2 (en) | 2011-07-15 | 2016-07-05 | International Business Machines Corporation | Use and enforcement of provenance and lineage constraints |
US9691068B1 (en) * | 2011-12-15 | 2017-06-27 | Amazon Technologies, Inc. | Public-domain analyzer |
US9361377B1 (en) * | 2012-01-06 | 2016-06-07 | Amazon Technologies, Inc. | Classifier for classifying digital items |
US8909628B1 (en) | 2012-01-24 | 2014-12-09 | Google Inc. | Detecting content scraping |
US9418065B2 (en) | 2012-01-26 | 2016-08-16 | International Business Machines Corporation | Tracking changes related to a collection of documents |
US8953836B1 (en) * | 2012-01-31 | 2015-02-10 | Google Inc. | Real-time duplicate detection for uploaded videos |
US8990951B1 (en) | 2012-03-30 | 2015-03-24 | Google Inc. | Claiming delayed live reference streams |
US8799175B2 (en) * | 2012-04-24 | 2014-08-05 | Steven C. Sereboff | Automated intellectual property licensing |
US10134134B2 (en) | 2012-05-24 | 2018-11-20 | Qatar Foundation | Method and system for creating depth signatures |
GB201210702D0 (en) | 2012-06-15 | 2012-08-01 | Qatar Foundation | A system and method to store video fingerprints on distributed nodes in cloud systems |
CN102779176A (en) * | 2012-06-27 | 2012-11-14 | 北京奇虎科技有限公司 | System and method for key word filtering |
US20140052647A1 (en) * | 2012-08-17 | 2014-02-20 | Truth Seal Corporation | System and Method for Promoting Truth in Public Discourse |
US9519685B1 (en) * | 2012-08-30 | 2016-12-13 | deviantArt, Inc. | Tag selection, clustering, and recommendation for content hosting services |
US9971882B2 (en) | 2012-09-24 | 2018-05-15 | Qatar Foundation | System and method for multimedia content protection on cloud infrastructures |
US11429651B2 (en) | 2013-03-14 | 2022-08-30 | International Business Machines Corporation | Document provenance scoring based on changes between document versions |
US9336362B2 (en) * | 2013-04-08 | 2016-05-10 | Microsoft Technology Licensing, Llc | Remote installation of digital content |
US11321775B2 (en) * | 2013-06-27 | 2022-05-03 | Euroclear Sa/Nv | Asset inventory system |
JP6188490B2 (en) * | 2013-08-28 | 2017-08-30 | キヤノン株式会社 | Image display apparatus, control method, and computer program |
US9529840B1 (en) | 2014-01-14 | 2016-12-27 | Google Inc. | Real-time duplicate detection of videos in a massive video sharing system |
US9311639B2 (en) | 2014-02-11 | 2016-04-12 | Digimarc Corporation | Methods, apparatus and arrangements for device to device communication |
US9665614B2 (en) | 2014-03-25 | 2017-05-30 | Google Inc. | Preventing abuse in content sharing system |
US9876798B1 (en) * | 2014-03-31 | 2018-01-23 | Google Llc | Replacing unauthorized media items with authorized media items across platforms |
US9680898B2 (en) * | 2014-09-15 | 2017-06-13 | Sony Corporation | Comment link to streaming media |
US20160239675A1 (en) * | 2015-02-17 | 2016-08-18 | Joshua D. Tobkin | System and method for permission based digital content syndication, monetization, and licensing with access control by the copyright holder |
US10091296B2 (en) | 2015-04-17 | 2018-10-02 | Dropbox, Inc. | Collection folder for collecting file submissions |
US9692826B2 (en) | 2015-04-17 | 2017-06-27 | Dropbox, Inc. | Collection folder for collecting file submissions via a customizable file request |
US10089479B2 (en) | 2015-04-17 | 2018-10-02 | Dropbox, Inc. | Collection folder for collecting file submissions from authenticated submitters |
US10885209B2 (en) | 2015-04-17 | 2021-01-05 | Dropbox, Inc. | Collection folder for collecting file submissions in response to a public file request |
US20160321629A1 (en) * | 2015-05-01 | 2016-11-03 | Monegraph, Inc. | Digital content rights transfers within social networks |
US10360255B2 (en) * | 2015-12-08 | 2019-07-23 | Facebook, Inc. | Systems and methods to determine location of media items |
US9723344B1 (en) * | 2015-12-29 | 2017-08-01 | Google Inc. | Early detection of policy violating media |
US10713966B2 (en) | 2015-12-31 | 2020-07-14 | Dropbox, Inc. | Assignments for classrooms |
US10572558B2 (en) * | 2016-03-07 | 2020-02-25 | At&T Intellectual Property I, L.P. | Method and system for providing expertise collaboration |
RU2660593C2 (en) | 2016-04-07 | 2018-07-06 | Общество С Ограниченной Ответственностью "Яндекс" | Method and server of defining the original reference to the original object |
US10909173B2 (en) | 2016-12-09 | 2021-02-02 | The Nielsen Company (Us), Llc | Scalable architectures for reference signature matching and updating |
US11128675B2 (en) | 2017-03-20 | 2021-09-21 | At&T Intellectual Property I, L.P. | Automatic ad-hoc multimedia conference generator |
US20180300701A1 (en) * | 2017-04-12 | 2018-10-18 | Facebook, Inc. | Systems and methods for content management |
CN109190879B (en) * | 2018-07-18 | 2020-08-11 | 阿里巴巴集团控股有限公司 | Method and device for training adaptation level evaluation model and evaluating adaptation level |
US10769263B1 (en) | 2019-05-07 | 2020-09-08 | Alibaba Group Holding Limited | Certificate verification |
US11809482B2 (en) | 2019-08-12 | 2023-11-07 | Medex Forensics, Inc. | Source identifying forensics system, device, and method for multimedia files |
US10885347B1 (en) | 2019-09-18 | 2021-01-05 | International Business Machines Corporation | Out-of-context video detection |
US12192582B2 (en) * | 2021-12-16 | 2025-01-07 | Mux, Inc. | System and method for removing copyrighted material from a streaming platform |

Citations (251)

Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4677466A (en) | 1985-07-29 | 1987-06-30 | A. C. Nielsen Company | Broadcast program identification method and apparatus |
US5210820A (en) | 1990-05-02 | 1993-05-11 | Broadcast Data Systems Limited Partnership | Signal recognition system and method |
US5629980A (en) | 1994-11-23 | 1997-05-13 | Xerox Corporation | System for controlling the distribution and use of digital works |
US5634012A (en) | 1994-11-23 | 1997-05-27 | Xerox Corporation | System for controlling the distribution and use of digital works having a fee reporting mechanism |
US5664018A (en) | 1996-03-12 | 1997-09-02 | Leighton; Frank Thomson | Watermarking process resilient to collusion attacks |
US5679940A (en) | 1994-12-02 | 1997-10-21 | Telecheck International, Inc. | Transaction system with on/off line risk assessment |
US5715403A (en) | 1994-11-23 | 1998-02-03 | Xerox Corporation | System for controlling the distribution and use of digital works having attached usage rights where the usage rights are defined by a usage rights grammar |
US5892536A (en) | 1996-10-03 | 1999-04-06 | Personal Audio | Systems and methods for computer enhanced broadcast monitoring |
US5910987A (en) | 1995-02-13 | 1999-06-08 | Intertrust Technologies Corp. | Systems and methods for secure transaction management and electronic rights protection |
US5913205A (en) | 1996-03-29 | 1999-06-15 | Virage, Inc. | Query optimization for visual information retrieval system |
US5918223A (en) | 1996-07-22 | 1999-06-29 | Muscle Fish | Method and article of manufacture for content-based analysis, storage, retrieval, and segmentation of audio information |
US6035055A (en) | 1997-11-03 | 2000-03-07 | Hewlett-Packard Company | Digital image management system in a distributed data access network system |
US6091822A (en) | 1998-01-08 | 2000-07-18 | Macrovision Corporation | Method and apparatus for recording scrambled video audio signals and playing back said video signal, descrambled, within a secure environment |
US6121530A (en) | 1998-03-19 | 2000-09-19 | Sonoda; Tomonari | World Wide Web-based melody retrieval system with thresholds determined by using distribution of pitch and span of notes |
US6236971B1 (en) | 1994-11-23 | 2001-05-22 | Contentguard Holdings, Inc. | System for controlling the distribution and use of digital works using digital tickets |
US20010010756A1 (en) | 1997-01-23 | 2001-08-02 | Sony Corporation | Information signal output control method, information signal duplication prevention method, information signal duplication prevention device, and information signal recording medium |
US6292575B1 (en) | 1998-07-20 | 2001-09-18 | Lau Technologies | Real-time facial recognition and verification system |
US6295439B1 (en) | 1997-03-21 | 2001-09-25 | Educational Testing Service | Methods and systems for presentation and evaluation of constructed responses assessed by human evaluators |
US6301370B1 (en) | 1998-04-13 | 2001-10-09 | Eyematic Interfaces, Inc. | Face recognition from video images |
WO2002011033A1 (en) | 2000-07-28 | 2002-02-07 | Copyright.Net Inc. | Apparatus and method for transmitting and keeping track of legal notices |
US20020028000A1 (en) | 1999-05-19 | 2002-03-07 | Conwell William Y. | Content identifiers triggering corresponding responses through collaborative processing |
US20020031253A1 (en) | 1998-12-04 | 2002-03-14 | Orang Dialameh | System and method for feature location and tracking in multiple dimensions including depth |
US20020038296A1 (en) | 2000-02-18 | 2002-03-28 | Margolus Norman H. | Data repository and method for promoting network storage of data |
US20020048369A1 (en) | 1995-02-13 | 2002-04-25 | Intertrust Technologies Corp. | Systems and methods for secure transaction management and electronic rights protection |
US20020052885A1 (en) | 2000-05-02 | 2002-05-02 | Levy Kenneth L. | Using embedded data with file sharing |
US20020069370A1 (en) | 2000-08-31 | 2002-06-06 | Infoseer, Inc. | System and method for tracking and preventing illegal distribution of proprietary material over computer networks |
US6407680B1 (en) | 2000-12-22 | 2002-06-18 | Generic Media, Inc. | Distributed on-demand media transcoding system and method |
US20020082731A1 (en) | 2000-11-03 | 2002-06-27 | International Business Machines Corporation | System for monitoring audio content in a video broadcast |
US20020082999A1 (en) | 2000-10-19 | 2002-06-27 | Cheol-Woong Lee | Method of preventing reduction of sales amount of records due to digital music file illegally distributed through communication network |
US20020087885A1 (en) | 2001-01-03 | 2002-07-04 | Vidius Inc. | Method and application for a reactive defense against illegal distribution of multimedia content in file sharing networks |
US6430306B2 (en) | 1995-03-20 | 2002-08-06 | Lau Technologies | Systems and methods for identifying images |
WO2002065782A1 (en) | 2001-02-12 | 2002-08-22 | Koninklijke Philips Electronics N.V. | Generating and matching hashes of multimedia content |
US20020141578A1 (en) | 2001-03-29 | 2002-10-03 | Ripley Michael S. | Method and apparatus for content protection across a source-to-destination interface |
US6466695B1 (en) | 1999-08-04 | 2002-10-15 | Eyematic Interfaces, Inc. | Procedure for automatic analysis of images and image sequences based on two-dimensional shape primitives |
US20020152215A1 (en) | 2000-10-25 | 2002-10-17 | Clark George Philip | Distributing electronic books over a computer network |
US20020165819A1 (en) | 2001-05-02 | 2002-11-07 | Gateway, Inc. | System and method for providing distributed computing services |
US20020168082A1 (en) | 2001-03-07 | 2002-11-14 | Ravi Razdan | Real-time, distributed, transactional, hybrid watermarking method to provide trace-ability and copyright protection of digital content in peer-to-peer networks |
US20020174132A1 (en) | 2001-05-04 | 2002-11-21 | Allresearch, Inc. | Method and system for detecting unauthorized trademark use on the internet |
US20020178271A1 (en) | 2000-11-20 | 2002-11-28 | Graham Todd D. | Dynamic file access control and management |
WO2002103968A1 (en) | 2001-06-15 | 2002-12-27 | Beep Science As | An arrangement and a method for content policy control in a mobile multimedia messaging system |
US6505160B1 (en) | 1995-07-27 | 2003-01-07 | Digimarc Corporation | Connected audio and other media objects |
US20030018709A1 (en) | 2001-07-20 | 2003-01-23 | Audible Magic | Playlist generation method and apparatus |
US6513018B1 (en) | 1994-05-05 | 2003-01-28 | Fair, Isaac And Company, Inc. | Method and apparatus for scoring the likelihood of a desired performance result |
US20030021441A1 (en) | 1995-07-27 | 2003-01-30 | Levy Kenneth L. | Connected audio and other media objects |
US20030023852A1 (en) | 2001-07-10 | 2003-01-30 | Wold Erling H. | Method and apparatus for identifying an unkown work |
US20030033321A1 (en) | 2001-07-20 | 2003-02-13 | Audible Magic, Inc. | Method and apparatus for identifying new media content |
US20030037010A1 (en) | 2001-04-05 | 2003-02-20 | Audible Magic, Inc. | Copyright detection and protection system and method |
US20030052768A1 (en) | 2001-09-17 | 2003-03-20 | Maune James J. | Security method and system |
US20030061490A1 (en) | 2001-09-26 | 2003-03-27 | Abajian Aram Christian | Method for identifying copyright infringement violations by fingerprint detection |
US20030086341A1 (en) | 2001-07-20 | 2003-05-08 | Gracenote, Inc. | Automatic identification of sound recordings |
US6563950B1 (en) | 1996-06-25 | 2003-05-13 | Eyematic Interfaces, Inc. | Labeled bunch graphs for image analysis |
US20030093790A1 (en) | 2000-03-28 | 2003-05-15 | Logan James D. | Audio and video program recording, editing and playback systems using metadata |
US20030101104A1 (en) | 2001-11-28 | 2003-05-29 | Koninklijke Philips Electronics N.V. | System and method for retrieving information related to targeted subjects |
US20030099379A1 (en) | 2001-11-26 | 2003-05-29 | Monk Bruce C. | Validation and verification apparatus and method |
US6574609B1 (en) | 1998-08-13 | 2003-06-03 | International Business Machines Corporation | Secure electronic content management system |
US20030115459A1 (en) | 2001-12-17 | 2003-06-19 | Monk Bruce C. | Document and bearer verification system |
US20030135623A1 (en) | 2001-10-23 | 2003-07-17 | Audible Magic, Inc. | Method and apparatus for cache promotion |
US6597775B2 (en) | 2000-09-29 | 2003-07-22 | Fair Isaac Corporation | Self-learning real-time prioritization of telecommunication fraud control actions |
US6647548B1 (en) | 1996-09-06 | 2003-11-11 | Nielsen Media Research, Inc. | Coded/non-coded program audience measurement system |
US20030216988A1 (en) | 2002-05-17 | 2003-11-20 | Cassandra Mollett | Systems and methods for using phone number validation in a risk assessment |
US20030216824A1 (en) * | 2002-05-14 | 2003-11-20 | Docomo Communications Laboratories Usa, Inc. | Method and apparatus for self-degrading digital data |
US20030231785A1 (en) | 1993-11-18 | 2003-12-18 | Rhoads Geoffrey B. | Watermark embedder and reader |
US20040010602A1 (en) | 2002-07-10 | 2004-01-15 | Van Vleck Paul F. | System and method for managing access to digital content via digital rights policies |
US6684254B1 (en) | 2000-05-31 | 2004-01-27 | International Business Machines Corporation | Hyperlink filter for “pirated” and “disputed” copyright material on the internet in a method, system and program |
US20040022444A1 (en) | 1993-11-18 | 2004-02-05 | Rhoads Geoffrey B. | Authentication using a digital watermark |
US6693236B1 (en) | 1999-12-28 | 2004-02-17 | Monkeymedia, Inc. | User interface for simultaneous management of owned and unowned inventory |
US20040054661A1 (en) | 2002-09-13 | 2004-03-18 | Dominic Cheung | Automated processing of appropriateness determination of content for search listings in wide area network searches |
US20040059953A1 (en) | 2002-09-24 | 2004-03-25 | Arinc | Methods and systems for identity management |
US20040064415A1 (en) | 2002-07-12 | 2004-04-01 | Abdallah David S. | Personal authentication software and systems for travel privilege assignation and verification |
US20040071314A1 (en) | 1999-10-28 | 2004-04-15 | Yacov Yacobi | Methods and systems for fingerprinting digital data |
US6772196B1 (en) | 2000-07-27 | 2004-08-03 | Propel Software Corp. | Electronic mail filtering system and methods |
US20040153663A1 (en) | 2002-11-01 | 2004-08-05 | Clark Robert T. | System, method and computer program product for assessing risk of identity theft |
US20040163106A1 (en) | 2003-02-01 | 2004-08-19 | Audible Magic, Inc. | Method and apparatus to identify a work received by a processing system |
US6795638B1 (en) | 1999-09-30 | 2004-09-21 | New Jersey Devils, Llc | System and method for recording and preparing statistics concerning live performances |
US20040189441A1 (en) | 2003-03-24 | 2004-09-30 | Kosmas Stergiou | Apparatus and methods for verification and authentication employing voluntary attributes, knowledge management and databases |
US20040205030A1 (en) | 2001-10-24 | 2004-10-14 | Capital Confirmation, Inc. | Systems, methods and computer readable medium providing automated third-party confirmations |
US6810388B1 (en) | 2000-03-24 | 2004-10-26 | Trinity Security Systems, Inc. | Digital contents copying inhibition apparatus, digital contents copying inhibition method, and computer products |
US20040213437A1 (en) | 2002-11-26 | 2004-10-28 | Howard James V | Systems and methods for managing and detecting fraud in image databases used with identification documents |
US20040221118A1 (en) | 2003-01-29 | 2004-11-04 | Slater Alastair Michael | Control of access to data content for read and/or write operations |
US20040225645A1 (en) | 2003-05-06 | 2004-11-11 | Rowney Kevin T. | Personal computing device -based mechanism to detect preselected data |
US20040230529A1 (en) | 2001-11-20 | 2004-11-18 | Contentguard Holdings, Inc. | System and method for granting access to an item or permission to use an item based on configurable conditions |
US20040230527A1 (en) | 2003-04-29 | 2004-11-18 | First Data Corporation | Authentication for online money transfers |
US20040243567A1 (en) | 2003-03-03 | 2004-12-02 | Levy Kenneth L. | Integrating and enhancing searching of media content and biometric databases |
US6829368B2 (en) | 2000-01-26 | 2004-12-07 | Digimarc Corporation | Establishing and interacting with on-line media collections using identifiers in media signals |
US20040245330A1 (en) | 2003-04-03 | 2004-12-09 | Amy Swift | Suspicious persons database |
US20040255147A1 (en) | 2003-05-06 | 2004-12-16 | Vidius Inc. | Apparatus and method for assuring compliance with distribution and usage policy |
US6834308B1 (en) | 2000-02-17 | 2004-12-21 | Audible Magic Corporation | Method and apparatus for identifying media content presented on a media playing device |
US20040267552A1 (en) | 2003-06-26 | 2004-12-30 | Contentguard Holdings, Inc. | System and method for controlling rights expressions by stakeholders of an item |
US20050008225A1 (en) | 2003-06-27 | 2005-01-13 | Hiroyuki Yanagisawa | System, apparatus, and method for providing illegal use research service for image data, and system, apparatus, and method for providing proper use research service for image data |
US20050025335A1 (en) | 2002-04-18 | 2005-02-03 | Bloom Jeffrey Adam | Method and apparatus for providing an asymmetric watermark carrier |
US20050039057A1 (en) | 2003-07-24 | 2005-02-17 | Amit Bagga | Method and apparatus for authenticating a user using query directed passwords |
US20050043960A1 (en) | 2003-08-19 | 2005-02-24 | David Blankley | System and automate the licensing, re-use and royalties of authored content in derivative works |
US20050043548A1 (en) | 2003-08-22 | 2005-02-24 | Joseph Cates | Automated monitoring and control system for networked communications |
US20050060643A1 (en) * | 2003-08-25 | 2005-03-17 | Miavia, Inc. | Document similarity detection and classification system |
US6871200B2 (en) | 2002-07-11 | 2005-03-22 | Forensic Eye Ltd. | Registration and monitoring system |
US20050080846A1 (en) | 2003-09-27 | 2005-04-14 | Webhound, Inc. | Method and system for updating digital content over a network |
US6889383B1 (en) | 2000-10-23 | 2005-05-03 | Clearplay, Inc. | Delivery of navigation data for playback of audio and video content |
US20050102515A1 (en) | 2003-02-03 | 2005-05-12 | Dave Jaworski | Controlling read and write operations for digital media |
US20050105726A1 (en) | 2002-04-12 | 2005-05-19 | Christian Neubauer | Method and device for embedding watermark information and method and device for extracting embedded watermark information |
US6898799B1 (en) | 2000-10-23 | 2005-05-24 | Clearplay, Inc. | Multimedia content navigation and playback |
US20050125358A1 (en) | 2003-12-04 | 2005-06-09 | Black Duck Software, Inc. | Authenticating licenses for legally-protectable content based on license profiles and content identifiers |
US20050132235A1 (en) | 2003-12-15 | 2005-06-16 | Remco Teunen | System and method for providing improved claimant authentication |
US20050141707A1 (en) | 2002-02-05 | 2005-06-30 | Haitsma Jaap A. | Efficient storage of fingerprints |
US20050154924A1 (en) | 1998-02-13 | 2005-07-14 | Scheidt Edward M. | Multiple factor-based user identification and authentication |
US20050171851A1 (en) | 2004-01-30 | 2005-08-04 | Applebaum Ted H. | Multiple choice challenge-response user authorization system and method |
US20050192902A1 (en) * | 2003-12-05 | 2005-09-01 | Motion Picture Association Of America | Digital rights management using multiple independent parameters |
US20050193408A1 (en) | 2000-07-24 | 2005-09-01 | Vivcom, Inc. | Generating, transporting, processing, storing and presenting segmentation information for audio-visual programs |
US20050193016A1 (en) | 2004-02-17 | 2005-09-01 | Nicholas Seet | Generation of a media content database by correlating repeating media content in media streams |
US6944604B1 (en) | 2001-07-03 | 2005-09-13 | Fair Isaac Corporation | Mechanism and method for specified temporal deployment of rules within a rule server |
US20050222900A1 (en) | 2004-03-30 | 2005-10-06 | Prashant Fuloria | Selectively delivering advertisements based at least in part on trademark issues |
US20050246752A1 (en) | 1999-08-03 | 2005-11-03 | Gad Liwerant | Method and system for sharing video over a network |
US6965889B2 (en) | 2000-05-09 | 2005-11-15 | Fair Isaac Corporation | Approach for generating rules |
US6968328B1 (en) | 2000-12-29 | 2005-11-22 | Fair Isaac Corporation | Method and system for implementing rules and ruleflows |
US20050259819A1 (en) | 2002-06-24 | 2005-11-24 | Koninklijke Philips Electronics | Method for generating hashes from a compressed multimedia content |
US20050273617A1 (en) | 2001-04-24 | 2005-12-08 | Microsoft Corporation | Robust recognizer of perceptually similar content |
US20050273612A1 (en) | 2002-07-26 | 2005-12-08 | Koninklijke Philips Electronics N.V. | Identification of digital data sequences |
US6976165B1 (en) | 1999-09-07 | 2005-12-13 | Emc Corporation | System and method for secure storage, transfer and retrieval of content addressable information |
US20050288952A1 (en) | 2004-05-18 | 2005-12-29 | Davis Bruce L | Official documents and methods of issuance |
US6983371B1 (en) | 1998-10-22 | 2006-01-03 | International Business Machines Corporation | Super-distribution of protected digital content |
US20060010500A1 (en) * | 2004-02-03 | 2006-01-12 | Gidon Elazar | Protection of digital data content |
US6990453B2 (en) | 2000-07-31 | 2006-01-24 | Landmark Digital Services Llc | System and methods for recognizing sound and music signals in high noise and distortion |
US20060031870A1 (en) | 2000-10-23 | 2006-02-09 | Jarman Matthew T | Apparatus, system, and method for filtering objectionable portions of a multimedia presentation |
US20060034177A1 (en) | 2004-07-28 | 2006-02-16 | Audible Magic Corporation | System for distributing decoy content in a peer to peer network |
US7003131B2 (en) | 2002-07-09 | 2006-02-21 | Kaleidescape, Inc. | Watermarking and fingerprinting digital content using alternative blocks to embed information |
US20060050993A1 (en) | 2002-12-19 | 2006-03-09 | Stentiford Frederick W | Searching images |
US20060059561A1 (en) | 2004-04-14 | 2006-03-16 | Digital River, Inc. | Electronic storefront that limits download of software wrappers based on geographic location |
US7020635B2 (en) | 2001-11-21 | 2006-03-28 | Line 6, Inc | System and method of secure electronic commerce transactions including tracking and recording the distribution and usage of assets |
US20060075237A1 (en) | 2002-11-12 | 2006-04-06 | Koninklijke Philips Electronics N.V. | Fingerprinting multimedia contents |
US20060080356A1 (en) | 2004-10-13 | 2006-04-13 | Microsoft Corporation | System and method for inferring similarities between media objects |
US20060085816A1 (en) | 2004-10-18 | 2006-04-20 | Funk James M | Method and apparatus to control playback in a download-and-view video on demand system |
US7043473B1 (en) | 2000-11-22 | 2006-05-09 | Widevine Technologies, Inc. | Media tracking system and method |
US7047241B1 (en) | 1995-10-13 | 2006-05-16 | Digimarc Corporation | System and methods for managing digital creative works |
US20060106725A1 (en) | 2004-11-12 | 2006-05-18 | International Business Machines Corporation | Method, system, and program product for visual display of a license status for a software program |
US20060106675A1 (en) | 2004-11-16 | 2006-05-18 | Cohen Peter D | Providing an electronic marketplace to facilitate human performance of programmatically submitted tasks |
US20060106774A1 (en) | 2004-11-16 | 2006-05-18 | Cohen Peter D | Using qualifications of users to facilitate user performance of tasks |
US20060110137A1 (en) | 2004-11-25 | 2006-05-25 | Matsushita Electric Industrial Co., Ltd. | Video and audio data transmitting apparatus, and video and audio data transmitting method |
US20060112015A1 (en) | 2004-11-24 | 2006-05-25 | Contentguard Holdings, Inc. | Method, system, and device for handling creation of derivative works and for adapting rights to derivative works |
US20060115108A1 (en) | 2004-06-22 | 2006-06-01 | Rodriguez Tony F | Metadata management and generation using digital watermarks |
US7085741B2 (en) | 2001-01-17 | 2006-08-01 | Contentguard Holdings, Inc. | Method and apparatus for managing digital content usage rights |
US20060174348A1 (en) | 1999-05-19 | 2006-08-03 | Rhoads Geoffrey B | Watermark-based personal audio appliance |
US20060171474A1 (en) | 2002-10-23 | 2006-08-03 | Nielsen Media Research | Digital data insertion apparatus and methods for use with compressed audio/video data |
US20060177198A1 (en) | 2004-10-20 | 2006-08-10 | Jarman Matthew T | Media player configured to receive playback filters from alternative storage mediums |
US20060212927A1 (en) | 2002-12-20 | 2006-09-21 | Kabushiki Kaisha Toshiba | Content management system, recording medium and method |
US20060218126A1 (en) | 2003-03-13 | 2006-09-28 | Hendrikus Albertus De Ruijter | Data retrieval method and system |
US7117513B2 (en) | 2001-11-09 | 2006-10-03 | Nielsen Media Research, Inc. | Apparatus and method for detecting and correcting a corrupted broadcast time code |
US20060230358A1 (en) | 2003-05-02 | 2006-10-12 | Jorn Sacher | System for inspecting a printed image |
US20060240862A1 (en) | 2004-02-20 | 2006-10-26 | Hartmut Neven | Mobile image-based information retrieval system |
US20060277564A1 (en) | 2003-10-22 | 2006-12-07 | Jarman Matthew T | Apparatus and method for blocking audio/visual programming and for muting audio |
US20060287916A1 (en) | 2005-06-15 | 2006-12-21 | Steven Starr | Media marketplaces |
US20060287996A1 (en) | 2005-06-16 | 2006-12-21 | International Business Machines Corporation | Computer-implemented method, system, and program product for tracking content |
US20070028308A1 (en) | 2005-07-29 | 2007-02-01 | Kosuke Nishio | Decoding apparatus |
US20070033397A1 (en) * | 2003-10-20 | 2007-02-08 | Phillips Ii Eugene B | Securing digital content system and method |
US20070038567A1 (en) | 2005-08-12 | 2007-02-15 | Jeremy Allaire | Distribution of content |
US7185201B2 (en) | 1999-05-19 | 2007-02-27 | Digimarc Corporation | Content identifiers triggering corresponding responses |
US20070061393A1 (en) | 2005-02-01 | 2007-03-15 | Moore James F | Management of health care data |
US20070058925A1 (en) | 2005-09-14 | 2007-03-15 | Fu-Sheng Chiu | Interactive multimedia production |
US7194490B2 (en) | 2001-05-22 | 2007-03-20 | Christopher Zee | Method for the assured and enduring archival of intellectual property |
US7197459B1 (en) | 2001-03-19 | 2007-03-27 | Amazon Technologies, Inc. | Hybrid machine/human computing arrangement |
US20070083883A1 (en) | 2004-03-29 | 2007-04-12 | Deng Kevin K | Methods and apparatus to detect a blank frame in a digital video broadcast signal |
US20070094145A1 (en) | 2005-10-24 | 2007-04-26 | Contentguard Holdings, Inc. | Method and system to support dynamic rights and resources sharing |
US20070098172A1 (en) | 2002-07-16 | 2007-05-03 | Levy Kenneth L | Digital Watermarking Applications |
US20070101360A1 (en) | 2003-11-17 | 2007-05-03 | Koninklijke Philips Electronics, N.V. | Commercial insertion into video streams based on surrounding program content |
US20070110010A1 (en) | 2005-11-14 | 2007-05-17 | Sakari Kotola | Portable local server with context sensing |
US20070110089A1 (en) * | 2003-11-27 | 2007-05-17 | Advestigo | System for intercepting multimedia documents |
US20070124251A1 (en) * | 2003-10-16 | 2007-05-31 | Sharp Kabushiki Kaisha | Content use control device, reording device, reproduction device, recording medium, and content use control method |
US20070124756A1 (en) | 2005-11-29 | 2007-05-31 | Google Inc. | Detecting Repeating Content in Broadcast Media |
US20070130177A1 (en) | 2005-09-23 | 2007-06-07 | Tina Schneider | Media management system |
US20070130015A1 (en) | 2005-06-15 | 2007-06-07 | Steven Starr | Advertisement revenue sharing for distributed video |
US20070154190A1 (en) * | 2005-05-23 | 2007-07-05 | Gilley Thomas S | Content tracking for movie segment bookmarks |
US20070156594A1 (en) | 2006-01-03 | 2007-07-05 | Mcgucken Elliot | System and method for allowing creators, artsists, and owners to protect and profit from content |
US20070162761A1 (en) | 2005-12-23 | 2007-07-12 | Davis Bruce L | Methods and Systems to Help Detect Identity Fraud |
US20070168543A1 (en) | 2004-06-07 | 2007-07-19 | Jason Krikorian | Capturing and Sharing Media Content |
US20070180537A1 (en) | 2005-01-07 | 2007-08-02 | Shan He | Method for fingerprinting multimedia content |
US20070175998A1 (en) | 2005-09-01 | 2007-08-02 | Lev Zvi H | System and method for reliable content access using a cellular/wireless device with imaging capabilities |
US20070192352A1 (en) | 2005-12-21 | 2007-08-16 | Levy Kenneth L | Content Metadata Directory Services |
US20070198426A1 (en) | 2004-03-04 | 2007-08-23 | Yates James M | Method and apparatus for digital copyright exchange |
US20070203911A1 (en) | 2006-02-07 | 2007-08-30 | Fu-Sheng Chiu | Video weblog |
US7266704B2 (en) | 2000-12-18 | 2007-09-04 | Digimarc Corporation | User-friendly rights management systems and methods |
US20070208751A1 (en) | 2005-11-22 | 2007-09-06 | David Cowan | Personalized content control |
US20070211174A1 (en) | 2005-01-05 | 2007-09-13 | Daniel Putterman | Windows management in a television environment |
US20070220575A1 (en) | 2006-03-03 | 2007-09-20 | Verimatrix, Inc. | Movie studio-based network distribution system and method |
US20070234213A1 (en) | 2004-06-07 | 2007-10-04 | Jason Krikorian | Selection and Presentation of Context-Relevant Supplemental Content And Advertising |
US20070242880A1 (en) | 2005-05-18 | 2007-10-18 | Stebbings David W | System and method for the identification of motional media of widely varying picture content |
US20070253594A1 (en) | 2006-04-28 | 2007-11-01 | Vobile, Inc. | Method and system for fingerprinting digital video object based on multiresolution, multirate spatial and temporal signatures |
US7298864B2 (en) | 2000-02-19 | 2007-11-20 | Digimarc Corporation | Digital watermarks as a gateway and control mechanism |
US20070282472A1 (en) | 2006-06-01 | 2007-12-06 | International Business Machines Corporation | System and method for customizing soundtracks |
US7314162B2 (en) | 2003-10-17 | 2008-01-01 | Digimore Corporation | Method and system for reporting identity document usage |
US20080002854A1 (en) | 2003-10-08 | 2008-01-03 | Verance Corporation | Signal continuity assessment using embedded watermarks |
US20080005241A1 (en) | 2006-06-30 | 2008-01-03 | Searete Llc, A Limited Liability Corporation Of The State Of Delaware | Usage parameters for communication content |
US20080027931A1 (en) | 2006-02-27 | 2008-01-31 | Vobile, Inc. | Systems and methods for publishing, searching, retrieving and binding metadata for a digital object |
US20080034396A1 (en) | 2006-05-30 | 2008-02-07 | Lev Zvi H | System and method for video distribution and billing |
US20080051029A1 (en) | 2006-08-25 | 2008-02-28 | Bradley James Witteman | Phone-based broadcast audio identification |
US20080059461A1 (en) | 2006-08-29 | 2008-03-06 | Attributor Corporation | Content search using a provided interface |
US20080059536A1 (en) | 2006-08-29 | 2008-03-06 | Attributor Corporation | Content monitoring and host compliance evaluation |
US7346472B1 (en) | 2000-09-07 | 2008-03-18 | Blue Spike, Inc. | Method and device for monitoring and analyzing signals |
US20080082405A1 (en) | 2006-09-29 | 2008-04-03 | Yahoo! Inc. | Digital media benefit attachment mechanism |
US7366787B2 (en) | 2001-06-08 | 2008-04-29 | Sun Microsystems, Inc. | Dynamic configuration of a content publisher |
US7369677B2 (en) | 2005-04-26 | 2008-05-06 | Verance Corporation | System reactions to the detection of embedded watermarks in a digital host content |
US7370017B1 (en) | 2002-12-20 | 2008-05-06 | Microsoft Corporation | Redistribution of rights-managed content and technique for encouraging same |
US20080109306A1 (en) | 2005-06-15 | 2008-05-08 | Maigret Robert J | Media marketplaces |
US20080109369A1 (en) | 2006-11-03 | 2008-05-08 | Yi-Ling Su | Content Management System |
US20080154401A1 (en) | 2004-04-19 | 2008-06-26 | Landmark Digital Services Llc | Method and System For Content Sampling and Identification |
US20080154739A1 (en) | 2006-12-22 | 2008-06-26 | Yahoo! Inc | Social Network Commerce Model |
US20080152146A1 (en) | 2005-01-24 | 2008-06-26 | Koninklijke Philips Electronics, N.V. | Private and Controlled Ownership Sharing |
US20080155701A1 (en) | 2006-12-22 | 2008-06-26 | Yahoo! Inc. | Method and system for unauthorized content detection and reporting |
US20080162449A1 (en) | 2006-12-28 | 2008-07-03 | Chen Chao-Yu | Dynamic page similarity measurement |
US20080162228A1 (en) | 2006-12-19 | 2008-07-03 | Friedrich Mechbach | Method and system for the integrating advertising in user generated contributions |
US20080165960A1 (en) | 2007-01-09 | 2008-07-10 | Tagstory Co., Ltd. | System for providing copyright-protected video data and method thereof |
US20080178302A1 (en) | 2007-01-19 | 2008-07-24 | Attributor Corporation | Determination of originality of content |
US20080209502A1 (en) | 2007-02-27 | 2008-08-28 | Seidel Craig H | Associating rights to multimedia content |
US7421723B2 (en) | 1999-01-07 | 2008-09-02 | Nielsen Media Research, Inc. | Detection of media links in broadcast signals |
US20080240490A1 (en) | 2007-03-30 | 2008-10-02 | Microsoft Corporation | Source authentication and usage tracking of video |
US20080249961A1 (en) | 2007-03-22 | 2008-10-09 | Harkness David H | Digital rights management and audience measurement systems and methods |
US20080317278A1 (en) | 2006-01-16 | 2008-12-25 | Frederic Lefebvre | Method for Computing a Fingerprint of a Video Sequence |
US20090006225A1 (en) | 2007-06-29 | 2009-01-01 | Microsoft Corporation | Distribution channels and monetizing |
US20090030651A1 (en) | 2007-07-27 | 2009-01-29 | Audible Magic Corporation | System for identifying content of digital data |
WO2009017049A1 (en) | 2007-07-27 | 2009-02-05 | Aisin Seiki Kabushiki Kaisha | Door handle device |
US20090052784A1 (en) | 2007-08-22 | 2009-02-26 | Michele Covell | Detection And Classification Of Matches Between Time-Based Media |
US20090083228A1 (en) | 2006-02-07 | 2009-03-26 | Mobixell Networks Ltd. | Matching of modified visual and audio media |
US7529659B2 (en) | 2005-09-28 | 2009-05-05 | Audible Magic Corporation | Method and apparatus for identifying an unknown work |
US20090119169A1 (en) | 2007-10-02 | 2009-05-07 | Blinkx Uk Ltd | Various methods and apparatuses for an engine that pairs advertisements with video files |
US20090129755A1 (en) | 2007-11-21 | 2009-05-21 | Shlomo Selim Rakib | Method and Apparatus for Generation, Distribution and Display of Interactive Video Content |
US20090144772A1 (en) | 2007-11-30 | 2009-06-04 | Google Inc. | Video object tag creation and processing |
US20090144325A1 (en) | 2006-11-03 | 2009-06-04 | Franck Chastagnol | Blocking of Unlicensed Audio Content in Video Files on a Video Hosting Website |
US20090165031A1 (en) | 2007-12-19 | 2009-06-25 | At&T Knowledge Ventures, L.P. | Systems and Methods to Identify Target Video Content |
US7562012B1 (en) | 2000-11-03 | 2009-07-14 | Audible Magic Corporation | Method and apparatus for creating a unique audio signature |
WO2009100093A1 (en) | 2008-02-05 | 2009-08-13 | Dolby Laboratories Licensing Corporation | Associating information with media content |
US20090313078A1 (en) | 2008-06-12 | 2009-12-17 | Cross Geoffrey Mark Timothy | Hybrid human/computer image processing method |
US20100017487A1 (en) | 2004-11-04 | 2010-01-21 | Vericept Corporation | Method, apparatus, and system for clustering and classification |
US7653552B2 (en) | 2001-03-21 | 2010-01-26 | Qurio Holdings, Inc. | Digital file marketplace |
US7681032B2 (en) | 2001-03-12 | 2010-03-16 | Portauthority Technologies Inc. | System and method for monitoring unauthorized transport of digital content |
US7707427B1 (en) | 2004-07-19 | 2010-04-27 | Michael Frederick Kenrich | Multi-level file digests |
US7711731B2 (en) | 2002-01-11 | 2010-05-04 | International Business Machines Corporation | Synthesizing information-bearing content from multiple channels |
US7730316B1 (en) | 2006-09-22 | 2010-06-01 | Fatlens, Inc. | Method for document fingerprinting |
US7761465B1 (en) * | 1999-09-17 | 2010-07-20 | Sony Corporation | Data providing system and method therefor |
US20100191819A1 (en) | 2003-01-24 | 2010-07-29 | Aol Inc. | Group Based Spam Classification |
US7783489B2 (en) | 1999-09-21 | 2010-08-24 | Iceberg Industries Llc | Audio identification system and method |
US7831531B1 (en) | 2006-06-22 | 2010-11-09 | Google Inc. | Approximate hashing functions for finding similar content |
US7870574B2 (en) | 1999-09-21 | 2011-01-11 | Iceberg Industries, Llc | Method and apparatus for automatically recognizing input audio and/or video streams |
US7881957B1 (en) | 2004-11-16 | 2011-02-01 | Amazon Technologies, Inc. | Identifying tasks for task performers based on task subscriptions |
US7899694B1 (en) | 2006-06-30 | 2011-03-01 | Amazon Technologies, Inc. | Generating solutions to problems via interactions with human responders |
US7945600B1 (en) * | 2001-05-18 | 2011-05-17 | Stratify, Inc. | Techniques for organizing data to support efficient review and analysis |
US7945470B1 (en) | 2006-09-29 | 2011-05-17 | Amazon Technologies, Inc. | Facilitating performance of submitted tasks by mobile task performers |
US8010511B2 (en) | 2006-08-29 | 2011-08-30 | Attributor Corporation | Content monitoring and compliance enforcement |
US8090717B1 (en) | 2002-09-20 | 2012-01-03 | Google Inc. | Methods and apparatus for ranking documents |
US8185967B2 (en) | 1999-03-10 | 2012-05-22 | Digimarc Corporation | Method and apparatus for content management |
US20120284803A1 (en) | 2001-01-17 | 2012-11-08 | Contentguard Holdings, Inc. | Method and apparatus for distributing enforceable property rights |
US20130006946A1 (en) | 2006-12-22 | 2013-01-03 | Commvault Systems, Inc. | System and method for storing redundant information |
US20130022333A1 (en) | 2004-12-03 | 2013-01-24 | Nec Corporation | Video content playback assistance method, video content playback assistance system, and information distribution program |
US20130085825A1 (en) | 2006-12-20 | 2013-04-04 | Digimarc Corp. | Method and system for determining content treatment |
US20150302886A1 (en) | 2006-08-29 | 2015-10-22 | Digimarc Corporation | Determination of originality of content |
Family Cites Families (48)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6947571B1 (en) * | 1999-05-19 | 2005-09-20 | Digimarc Corporation | Cell phones with optical capabilities, and related applications |
US5765152A (en) * | 1995-10-13 | 1998-06-09 | Trustees Of Dartmouth College | System and method for managing copyrighted electronic media |
US7159116B2 (en) * | 1999-12-07 | 2007-01-02 | Blue Spike, Inc. | Systems, methods and devices for trusted transactions |
US6108637A (en) * | 1996-09-03 | 2000-08-22 | Nielsen Media Research, Inc. | Content display monitor |
US5983351A (en) * | 1996-10-16 | 1999-11-09 | Intellectual Protocols, L.L.C. | Web site copyright registration system and method |
US6251016B1 (en) * | 1997-01-07 | 2001-06-26 | Fujitsu Limited | Information offering system for providing a lottery on a network |
WO1998037473A2 (en) * | 1997-02-07 | 1998-08-27 | General Internet, Inc. | Collaborative internet data mining system |
US6055538A (en) * | 1997-12-22 | 2000-04-25 | Hewlett Packard Company | Methods and system for using web browser to search large collections of documents |
US7051004B2 (en) * | 1998-04-03 | 2006-05-23 | Macrovision Corporation | System and methods providing secure delivery of licenses and content |
US6401118B1 (en) * | 1998-06-30 | 2002-06-04 | Online Monitoring Services | Method and computer program product for an online monitoring search engine |
US6603921B1 (en) * | 1998-07-01 | 2003-08-05 | International Business Machines Corporation | Audio/video archive system and method for automatic indexing and searching |
US7346605B1 (en) * | 1999-07-22 | 2008-03-18 | Markmonitor, Inc. | Method and system for searching and monitoring internet trademark usage |
US6493744B1 (en) * | 1999-08-16 | 2002-12-10 | International Business Machines Corporation | Automatic rating and filtering of data files for objectionable content |
US6546135B1 (en) * | 1999-08-30 | 2003-04-08 | Mitsubishi Electric Research Laboratories, Inc | Method for representing and comparing multimedia content |
WO2001022310A1 (en) * | 1999-09-22 | 2001-03-29 | Oleg Kharisovich Zommers | Interactive personal information system and method |
US6807634B1 (en) * | 1999-11-30 | 2004-10-19 | International Business Machines Corporation | Watermarks for customer identification |
US20020002586A1 (en) * | 2000-02-08 | 2002-01-03 | Howard Rafal | Methods and apparatus for creating and hosting customized virtual parties via the internet |
US20060080200A1 (en) * | 2000-04-07 | 2006-04-13 | Ashton David M | System and method for benefit plan administration |
US6952769B1 (en) * | 2000-04-17 | 2005-10-04 | International Business Machines Corporation | Protocols for anonymous electronic communication and double-blind transactions |
AU2001288670A1 (en) * | 2000-08-31 | 2002-03-13 | Myrio Corporation | Real-time audience monitoring, content rating, and content enhancing |
WO2002033505A2 (en) * | 2000-10-16 | 2002-04-25 | Vidius Inc. | A method and apparatus for supporting electronic content distribution |
US7120273B2 (en) * | 2002-05-31 | 2006-10-10 | Hewlett-Packard Development Company, Lp. | Apparatus and method for image group integrity protection |
US6931413B2 (en) * | 2002-06-25 | 2005-08-16 | Microsoft Corporation | System and method providing automated margin tree analysis and processing of sampled data |
JP4366916B2 (en) * | 2002-10-29 | 2009-11-18 | 富士ゼロックス株式会社 | Document confirmation system, document confirmation method, and document confirmation program |
JP2004180278A (en) * | 2002-11-15 | 2004-06-24 | Canon Inc | Information processing apparatus, server device, electronic data management system, information processing system, information processing method, computer program, and computer-readable storage medium |
JP2005004728A (en) * | 2003-05-20 | 2005-01-06 | Canon Inc | Information processing system, information processing device, information processing method, storage medium storing program for executing same so that program can be read out to information processing device, and program |
EP2557521A3 (en) * | 2003-07-07 | 2014-01-01 | Rovi Solutions Corporation | Reprogrammable security for controlling piracy and enabling interactive content |
US7444403B1 (en) * | 2003-11-25 | 2008-10-28 | Microsoft Corporation | Detecting sexually predatory content in an electronic communication |
US8255331B2 (en) * | 2004-03-04 | 2012-08-28 | Media Rights Technologies, Inc. | Method for providing curriculum enhancement using a computer-based media access system |
US20060080703A1 (en) * | 2004-03-22 | 2006-04-13 | Compton Charles L | Content storage method and system |
US20050276570A1 (en) * | 2004-06-15 | 2005-12-15 | Reed Ogden C Jr | Systems, processes and apparatus for creating, processing and interacting with audiobooks and other media |
JP2006039791A (en) * | 2004-07-26 | 2006-02-09 | Matsushita Electric Ind Co Ltd | Transmission history dependent processor |
JP4817624B2 (en) * | 2004-08-06 | 2011-11-16 | キヤノン株式会社 | Image processing system, image alteration judgment method, computer program, and computer-readable storage medium |
US7860922B2 (en) * | 2004-08-18 | 2010-12-28 | Time Warner, Inc. | Method and device for the wireless exchange of media content between mobile devices based on content preferences |
US7555487B2 (en) * | 2004-08-20 | 2009-06-30 | Xweb, Inc. | Image processing and identification system, method and apparatus |
US8117339B2 (en) * | 2004-10-29 | 2012-02-14 | Go Daddy Operating Company, LLC | Tracking domain name related reputation |
GB0424479D0 (en) * | 2004-11-05 | 2004-12-08 | Ibm | Generating a fingerprint for a document |
US7562228B2 (en) * | 2005-03-15 | 2009-07-14 | Microsoft Corporation | Forensic for fingerprint detection in multimedia |
US8365306B2 (en) * | 2005-05-25 | 2013-01-29 | Oracle International Corporation | Platform and service for management and multi-channel delivery of multi-types of contents |
JP2007011554A (en) * | 2005-06-29 | 2007-01-18 | Konica Minolta Business Technologies Inc | Image forming apparatus |
US20080004116A1 (en) * | 2006-06-30 | 2008-01-03 | Andrew Stephen Van Luchene | Video Game Environment |
WO2007047871A2 (en) * | 2005-10-17 | 2007-04-26 | Markmonitor Inc. | Client side brand protection |
JP4629555B2 (en) * | 2005-11-07 | 2011-02-09 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Restoration device, program, information system, restoration method, storage device, storage system, and storage method |
JP4154421B2 (en) * | 2005-12-07 | 2008-09-24 | キヤノン株式会社 | Image processing apparatus, program for executing the image processing method, and medium storing the program |
WO2007089943A2 (en) * | 2006-02-01 | 2007-08-09 | Markmonitor Inc. | Detecting online abuse in images |
US20070203891A1 (en) * | 2006-02-28 | 2007-08-30 | Microsoft Corporation | Providing and using search index enabling searching based on a targeted content of documents |
US20070208715A1 (en) * | 2006-03-02 | 2007-09-06 | Thomas Muehlbauer | Assigning Unique Content Identifiers to Digital Media Content |
US20080059211A1 (en) | 2006-08-29 | 2008-03-06 | Attributor Corporation | Content monitoring and compliance |
- 2007-01-19: US 11/655,748, patent US8707459B2, Active
- 2008-01-18: WO PCT/US2008/000707, patent WO2008088888A1, Application Filing
- 2014-05-06: US 14/271,297, patent US8935745B2, Active
- 2014-11-14: US 14/541,422, patent US9436810B2, Active
Patent Citations (275)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4677466A (en) | 1985-07-29 | 1987-06-30 | A. C. Nielsen Company | Broadcast program identification method and apparatus |
US5210820A (en) | 1990-05-02 | 1993-05-11 | Broadcast Data Systems Limited Partnership | Signal recognition system and method |
US20040022444A1 (en) | 1993-11-18 | 2004-02-05 | Rhoads Geoffrey B. | Authentication using a digital watermark |
US7113615B2 (en) | 1993-11-18 | 2006-09-26 | Digimarc Corporation | Watermark embedder and reader |
US20030231785A1 (en) | 1993-11-18 | 2003-12-18 | Rhoads Geoffrey B. | Watermark embedder and reader |
US6513018B1 (en) | 1994-05-05 | 2003-01-28 | Fair, Isaac And Company, Inc. | Method and apparatus for scoring the likelihood of a desired performance result |
US5629980A (en) | 1994-11-23 | 1997-05-13 | Xerox Corporation | System for controlling the distribution and use of digital works |
US5634012A (en) | 1994-11-23 | 1997-05-27 | Xerox Corporation | System for controlling the distribution and use of digital works having a fee reporting mechanism |
US20040107166A1 (en) | 1994-11-23 | 2004-06-03 | Contentguard Holding, Inc. | Usage rights grammar and digital works having usage rights created with the grammar |
US5715403A (en) | 1994-11-23 | 1998-02-03 | Xerox Corporation | System for controlling the distribution and use of digital works having attached usage rights where the usage rights are defined by a usage rights grammar |
US6236971B1 (en) | 1994-11-23 | 2001-05-22 | Contentguard Holdings, Inc. | System for controlling the distribution and use of digital works using digital tickets |
US5679938A (en) | 1994-12-02 | 1997-10-21 | Telecheck International, Inc. | Methods and systems for interactive check authorizations |
US5679940A (en) | 1994-12-02 | 1997-10-21 | Telecheck International, Inc. | Transaction system with on/off line risk assessment |
US5910987A (en) | 1995-02-13 | 1999-06-08 | Intertrust Technologies Corp. | Systems and methods for secure transaction management and electronic rights protection |
US20020048369A1 (en) | 1995-02-13 | 2002-04-25 | Intertrust Technologies Corp. | Systems and methods for secure transaction management and electronic rights protection |
US6430306B2 (en) | 1995-03-20 | 2002-08-06 | Lau Technologies | Systems and methods for identifying images |
US20030021441A1 (en) | 1995-07-27 | 2003-01-30 | Levy Kenneth L. | Connected audio and other media objects |
US6505160B1 (en) | 1995-07-27 | 2003-01-07 | Digimarc Corporation | Connected audio and other media objects |
US7047241B1 (en) | 1995-10-13 | 2006-05-16 | Digimarc Corporation | System and methods for managing digital creative works |
US5664018A (en) | 1996-03-12 | 1997-09-02 | Leighton; Frank Thomson | Watermarking process resilient to collusion attacks |
US5913205A (en) | 1996-03-29 | 1999-06-15 | Virage, Inc. | Query optimization for visual information retrieval system |
US6563950B1 (en) | 1996-06-25 | 2003-05-13 | Eyematic Interfaces, Inc. | Labeled bunch graphs for image analysis |
US5918223A (en) | 1996-07-22 | 1999-06-29 | Muscle Fish | Method and article of manufacture for content-based analysis, storage, retrieval, and segmentation of audio information |
US6647548B1 (en) | 1996-09-06 | 2003-11-11 | Nielsen Media Research, Inc. | Coded/non-coded program audience measurement system |
US5892536A (en) | 1996-10-03 | 1999-04-06 | Personal Audio | Systems and methods for computer enhanced broadcast monitoring |
US20010010756A1 (en) | 1997-01-23 | 2001-08-02 | Sony Corporation | Information signal output control method, information signal duplication prevention method, information signal duplication prevention device, and information signal recording medium |
US6295439B1 (en) | 1997-03-21 | 2001-09-25 | Educational Testing Service | Methods and systems for presentation and evaluation of constructed responses assessed by human evaluators |
US6035055A (en) | 1997-11-03 | 2000-03-07 | Hewlett-Packard Company | Digital image management system in a distributed data access network system |
US6091822A (en) | 1998-01-08 | 2000-07-18 | Macrovision Corporation | Method and apparatus for recording scrambled video audio signals and playing back said video signal, descrambled, within a secure environment |
US20050154924A1 (en) | 1998-02-13 | 2005-07-14 | Scheidt Edward M. | Multiple factor-based user identification and authentication |
US6121530A (en) | 1998-03-19 | 2000-09-19 | Sonoda; Tomonari | World Wide Web-based melody retrieval system with thresholds determined by using distribution of pitch and span of notes |
US6301370B1 (en) | 1998-04-13 | 2001-10-09 | Eyematic Interfaces, Inc. | Face recognition from video images |
US6292575B1 (en) | 1998-07-20 | 2001-09-18 | Lau Technologies | Real-time facial recognition and verification system |
US6574609B1 (en) | 1998-08-13 | 2003-06-03 | International Business Machines Corporation | Secure electronic content management system |
US6983371B1 (en) | 1998-10-22 | 2006-01-03 | International Business Machines Corporation | Super-distribution of protected digital content |
US20020031253A1 (en) | 1998-12-04 | 2002-03-14 | Orang Dialameh | System and method for feature location and tracking in multiple dimensions including depth |
US7421723B2 (en) | 1999-01-07 | 2008-09-02 | Nielsen Media Research, Inc. | Detection of media links in broadcast signals |
US8185967B2 (en) | 1999-03-10 | 2012-05-22 | Digimarc Corporation | Method and apparatus for content management |
US20060174348A1 (en) | 1999-05-19 | 2006-08-03 | Rhoads Geoffrey B | Watermark-based personal audio appliance |
US7302574B2 (en) | 1999-05-19 | 2007-11-27 | Digimarc Corporation | Content identifiers triggering corresponding responses through collaborative processing |
US7185201B2 (en) | 1999-05-19 | 2007-02-27 | Digimarc Corporation | Content identifiers triggering corresponding responses |
US20020028000A1 (en) | 1999-05-19 | 2002-03-07 | Conwell William Y. | Content identifiers triggering corresponding responses through collaborative processing |
US20050246752A1 (en) | 1999-08-03 | 2005-11-03 | Gad Liwerant | Method and system for sharing video over a network |
US6466695B1 (en) | 1999-08-04 | 2002-10-15 | Eyematic Interfaces, Inc. | Procedure for automatic analysis of images and image sequences based on two-dimensional shape primitives |
US6976165B1 (en) | 1999-09-07 | 2005-12-13 | Emc Corporation | System and method for secure storage, transfer and retrieval of content addressable information |
US7761465B1 (en) * | 1999-09-17 | 2010-07-20 | Sony Corporation | Data providing system and method therefor |
US7783489B2 (en) | 1999-09-21 | 2010-08-24 | Iceberg Industries Llc | Audio identification system and method |
US7870574B2 (en) | 1999-09-21 | 2011-01-11 | Iceberg Industries, Llc | Method and apparatus for automatically recognizing input audio and/or video streams |
US6795638B1 (en) | 1999-09-30 | 2004-09-21 | New Jersey Devils, Llc | System and method for recording and preparing statistics concerning live performances |
US20040071314A1 (en) | 1999-10-28 | 2004-04-15 | Yacov Yacobi | Methods and systems for fingerprinting digital data |
US6693236B1 (en) | 1999-12-28 | 2004-02-17 | Monkeymedia, Inc. | User interface for simultaneous management of owned and unowned inventory |
US6829368B2 (en) | 2000-01-26 | 2004-12-07 | Digimarc Corporation | Establishing and interacting with on-line media collections using identifiers in media signals |
US20050044189A1 (en) | 2000-02-17 | 2005-02-24 | Audible Magic Corporation. | Method and apparatus for identifying media content presented on a media playing device |
US6834308B1 (en) | 2000-02-17 | 2004-12-21 | Audible Magic Corporation | Method and apparatus for identifying media content presented on a media playing device |
US20020038296A1 (en) | 2000-02-18 | 2002-03-28 | Margolus Norman H. | Data repository and method for promoting network storage of data |
US7298864B2 (en) | 2000-02-19 | 2007-11-20 | Digimarc Corporation | Digital watermarks as a gateway and control mechanism |
US6810388B1 (en) | 2000-03-24 | 2004-10-26 | Trinity Security Systems, Inc. | Digital contents copying inhibition apparatus, digital contents copying inhibition method, and computer products |
US20030093790A1 (en) | 2000-03-28 | 2003-05-15 | Logan James D. | Audio and video program recording, editing and playback systems using metadata |
US20020052885A1 (en) | 2000-05-02 | 2002-05-02 | Levy Kenneth L. | Using embedded data with file sharing |
US6965889B2 (en) | 2000-05-09 | 2005-11-15 | Fair Isaac Corporation | Approach for generating rules |
US6684254B1 (en) | 2000-05-31 | 2004-01-27 | International Business Machines Corporation | Hyperlink filter for “pirated” and “disputed” copyright material on the internet in a method, system and program |
US20080052783A1 (en) | 2000-07-20 | 2008-02-28 | Levy Kenneth L | Using object identifiers with content distribution |
US20050193408A1 (en) | 2000-07-24 | 2005-09-01 | Vivcom, Inc. | Generating, transporting, processing, storing and presenting segmentation information for audio-visual programs |
US6772196B1 (en) | 2000-07-27 | 2004-08-03 | Propel Software Corp. | Electronic mail filtering system and methods |
WO2002011033A1 (en) | 2000-07-28 | 2002-02-07 | Copyright.Net Inc. | Apparatus and method for transmitting and keeping track of legal notices |
US6990453B2 (en) | 2000-07-31 | 2006-01-24 | Landmark Digital Services Llc | System and methods for recognizing sound and music signals in high noise and distortion |
US20020069370A1 (en) | 2000-08-31 | 2002-06-06 | Infoseer, Inc. | System and method for tracking and preventing illegal distribution of proprietary material over computer networks |
US7346472B1 (en) | 2000-09-07 | 2008-03-18 | Blue Spike, Inc. | Method and device for monitoring and analyzing signals |
US6597775B2 (en) | 2000-09-29 | 2003-07-22 | Fair Isaac Corporation | Self-learning real-time prioritization of telecommunication fraud control actions |
US20020082999A1 (en) | 2000-10-19 | 2002-06-27 | Cheol-Woong Lee | Method of preventing reduction of sales amount of records due to digital music file illegally distributed through communication network |
US6889383B1 (en) | 2000-10-23 | 2005-05-03 | Clearplay, Inc. | Delivery of navigation data for playback of audio and video content |
US20060031870A1 (en) | 2000-10-23 | 2006-02-09 | Jarman Matthew T | Apparatus, system, and method for filtering objectionable portions of a multimedia presentation |
US6898799B1 (en) | 2000-10-23 | 2005-05-24 | Clearplay, Inc. | Multimedia content navigation and playback |
US20020152215A1 (en) | 2000-10-25 | 2002-10-17 | Clark George Philip | Distributing electronic books over a computer network |
US20020082731A1 (en) | 2000-11-03 | 2002-06-27 | International Business Machines Corporation | System for monitoring audio content in a video broadcast |
US7562012B1 (en) | 2000-11-03 | 2009-07-14 | Audible Magic Corporation | Method and apparatus for creating a unique audio signature |
US20020178271A1 (en) | 2000-11-20 | 2002-11-28 | Graham Todd D. | Dynamic file access control and management |
US7043473B1 (en) | 2000-11-22 | 2006-05-09 | Widevine Technologies, Inc. | Media tracking system and method |
US7266704B2 (en) | 2000-12-18 | 2007-09-04 | Digimarc Corporation | User-friendly rights management systems and methods |
US20070294173A1 (en) | 2000-12-18 | 2007-12-20 | Levy Kenneth L | Rights Management System and Methods |
US6407680B1 (en) | 2000-12-22 | 2002-06-18 | Generic Media, Inc. | Distributed on-demand media transcoding system and method |
US6968328B1 (en) | 2000-12-29 | 2005-11-22 | Fair Isaac Corporation | Method and system for implementing rules and ruleflows |
US20020087885A1 (en) | 2001-01-03 | 2002-07-04 | Vidius Inc. | Method and application for a reactive defense against illegal distribution of multimedia content in file sharing networks |
US20120284803A1 (en) | 2001-01-17 | 2012-11-08 | Contentguard Holdings, Inc. | Method and apparatus for distributing enforceable property rights |
US7085741B2 (en) | 2001-01-17 | 2006-08-01 | Contentguard Holdings, Inc. | Method and apparatus for managing digital content usage rights |
WO2002065782A1 (en) | 2001-02-12 | 2002-08-22 | Koninklijke Philips Electronics N.V. | Generating and matching hashes of multimedia content |
US20020178410A1 (en) | 2001-02-12 | 2002-11-28 | Haitsma Jaap Andre | Generating and matching hashes of multimedia content |
US20020168082A1 (en) | 2001-03-07 | 2002-11-14 | Ravi Razdan | Real-time, distributed, transactional, hybrid watermarking method to provide trace-ability and copyright protection of digital content in peer-to-peer networks |
US7681032B2 (en) | 2001-03-12 | 2010-03-16 | Portauthority Technologies Inc. | System and method for monitoring unauthorized transport of digital content |
US7197459B1 (en) | 2001-03-19 | 2007-03-27 | Amazon Technologies, Inc. | Hybrid machine/human computing arrangement |
US7653552B2 (en) | 2001-03-21 | 2010-01-26 | Qurio Holdings, Inc. | Digital file marketplace |
US20020141578A1 (en) | 2001-03-29 | 2002-10-03 | Ripley Michael S. | Method and apparatus for content protection across a source-to-destination interface |
US20050154678A1 (en) | 2001-04-05 | 2005-07-14 | Audible Magic Corporation | Copyright detection and protection system and method |
US7363278B2 (en) | 2001-04-05 | 2008-04-22 | Audible Magic Corporation | Copyright detection and protection system and method |
US20030037010A1 (en) | 2001-04-05 | 2003-02-20 | Audible Magic, Inc. | Copyright detection and protection system and method |
US20050273617A1 (en) | 2001-04-24 | 2005-12-08 | Microsoft Corporation | Robust recognizer of perceptually similar content |
US20020165819A1 (en) | 2001-05-02 | 2002-11-07 | Gateway, Inc. | System and method for providing distributed computing services |
US20020174132A1 (en) | 2001-05-04 | 2002-11-21 | Allresearch, Inc. | Method and system for detecting unauthorized trademark use on the internet |
US7945600B1 (en) * | 2001-05-18 | 2011-05-17 | Stratify, Inc. | Techniques for organizing data to support efficient review and analysis |
US7194490B2 (en) | 2001-05-22 | 2007-03-20 | Christopher Zee | Method for the assured and enduring archival of intellectual property |
US7366787B2 (en) | 2001-06-08 | 2008-04-29 | Sun Microsystems, Inc. | Dynamic configuration of a content publisher |
WO2002103968A1 (en) | 2001-06-15 | 2002-12-27 | Beep Science As | An arrangement and a method for content policy control in a mobile multimedia messaging system |
US6944604B1 (en) | 2001-07-03 | 2005-09-13 | Fair Isaac Corporation | Mechanism and method for specified temporal deployment of rules within a rule server |
US20030023852A1 (en) | 2001-07-10 | 2003-01-30 | Wold Erling H. | Method and apparatus for identifying an unkown work |
US20030086341A1 (en) | 2001-07-20 | 2003-05-08 | Gracenote, Inc. | Automatic identification of sound recordings |
US7877438B2 (en) | 2001-07-20 | 2011-01-25 | Audible Magic Corporation | Method and apparatus for identifying new media content |
US20030033321A1 (en) | 2001-07-20 | 2003-02-13 | Audible Magic, Inc. | Method and apparatus for identifying new media content |
US20030018709A1 (en) | 2001-07-20 | 2003-01-23 | Audible Magic | Playlist generation method and apparatus |
US20030052768A1 (en) | 2001-09-17 | 2003-03-20 | Maune James J. | Security method and system |
US20030061490A1 (en) | 2001-09-26 | 2003-03-27 | Abajian Aram Christian | Method for identifying copyright infringement violations by fingerprint detection |
US20030135623A1 (en) | 2001-10-23 | 2003-07-17 | Audible Magic, Inc. | Method and apparatus for cache promotion |
US20040205030A1 (en) | 2001-10-24 | 2004-10-14 | Capital Confirmation, Inc. | Systems, methods and computer readable medium providing automated third-party confirmations |
US7117513B2 (en) | 2001-11-09 | 2006-10-03 | Nielsen Media Research, Inc. | Apparatus and method for detecting and correcting a corrupted broadcast time code |
US20040230529A1 (en) | 2001-11-20 | 2004-11-18 | Contentguard Holdings, Inc. | System and method for granting access to an item or permission to use an item based on configurable conditions |
US7020635B2 (en) | 2001-11-21 | 2006-03-28 | Line 6, Inc | System and method of secure electronic commerce transactions including tracking and recording the distribution and usage of assets |
US20030099379A1 (en) | 2001-11-26 | 2003-05-29 | Monk Bruce C. | Validation and verification apparatus and method |
US20030101104A1 (en) | 2001-11-28 | 2003-05-29 | Koninklijke Philips Electronics N.V. | System and method for retrieving information related to targeted subjects |
US20030115459A1 (en) | 2001-12-17 | 2003-06-19 | Monk Bruce C. | Document and bearer verification system |
US7711731B2 (en) | 2002-01-11 | 2010-05-04 | International Business Machines Corporation | Synthesizing information-bearing content from multiple channels |
US20050141707A1 (en) | 2002-02-05 | 2005-06-30 | Haitsma Jaap A. | Efficient storage of fingerprints |
US20050105726A1 (en) | 2002-04-12 | 2005-05-19 | Christian Neubauer | Method and device for embedding watermark information and method and device for extracting embedded watermark information |
US20050025335A1 (en) | 2002-04-18 | 2005-02-03 | Bloom Jeffrey Adam | Method and apparatus for providing an asymmetric watermark carrier |
US20030216824A1 (en) * | 2002-05-14 | 2003-11-20 | Docomo Communications Laboratories Usa, Inc. | Method and apparatus for self-degrading digital data |
US20030216988A1 (en) | 2002-05-17 | 2003-11-20 | Cassandra Mollett | Systems and methods for using phone number validation in a risk assessment |
US20050259819A1 (en) | 2002-06-24 | 2005-11-24 | Koninklijke Philips Electronics | Method for generating hashes from a compressed multimedia content |
US7003131B2 (en) | 2002-07-09 | 2006-02-21 | Kaleidescape, Inc. | Watermarking and fingerprinting digital content using alternative blocks to embed information |
US20040010602A1 (en) | 2002-07-10 | 2004-01-15 | Van Vleck Paul F. | System and method for managing access to digital content via digital rights policies |
US6871200B2 (en) | 2002-07-11 | 2005-03-22 | Forensic Eye Ltd. | Registration and monitoring system |
US20040064415A1 (en) | 2002-07-12 | 2004-04-01 | Abdallah David S. | Personal authentication software and systems for travel privilege assignation and verification |
US20070098172A1 (en) | 2002-07-16 | 2007-05-03 | Levy Kenneth L | Digital Watermarking Applications |
US20050273612A1 (en) | 2002-07-26 | 2005-12-08 | Koninklijke Philips Electronics N.V. | Identification of digital data sequences |
US20040054661A1 (en) | 2002-09-13 | 2004-03-18 | Dominic Cheung | Automated processing of appropriateness determination of content for search listings in wide area network searches |
US8090717B1 (en) | 2002-09-20 | 2012-01-03 | Google Inc. | Methods and apparatus for ranking documents |
US20040059953A1 (en) | 2002-09-24 | 2004-03-25 | Arinc | Methods and systems for identity management |
US20060171474A1 (en) | 2002-10-23 | 2006-08-03 | Nielsen Media Research | Digital data insertion apparatus and methods for use with compressed audio/video data |
US20040153663A1 (en) | 2002-11-01 | 2004-08-05 | Clark Robert T. | System, method and computer program product for assessing risk of identity theft |
US20060075237A1 (en) | 2002-11-12 | 2006-04-06 | Koninklijke Philips Electronics N.V. | Fingerprinting multimedia contents |
US20040213437A1 (en) | 2002-11-26 | 2004-10-28 | Howard James V | Systems and methods for managing and detecting fraud in image databases used with identification documents |
US20060050993A1 (en) | 2002-12-19 | 2006-03-09 | Stentiford Frederick W | Searching images |
US20060212927A1 (en) | 2002-12-20 | 2006-09-21 | Kabushiki Kaisha Toshiba | Content management system, recording medium and method |
US7370017B1 (en) | 2002-12-20 | 2008-05-06 | Microsoft Corporation | Redistribution of rights-managed content and technique for encouraging same |
US20100191819A1 (en) | 2003-01-24 | 2010-07-29 | Aol Inc. | Group Based Spam Classification |
US20040221118A1 (en) | 2003-01-29 | 2004-11-04 | Slater Alastair Michael | Control of access to data content for read and/or write operations |
US20040163106A1 (en) | 2003-02-01 | 2004-08-19 | Audible Magic, Inc. | Method and apparatus to identify a work received by a processing system |
US20050102515A1 (en) | 2003-02-03 | 2005-05-12 | Dave Jaworski | Controlling read and write operations for digital media |
US20040243567A1 (en) | 2003-03-03 | 2004-12-02 | Levy Kenneth L. | Integrating and enhancing searching of media content and biometric databases |
US20060218126A1 (en) | 2003-03-13 | 2006-09-28 | Hendrikus Albertus De Ruijter | Data retrieval method and system |
US20040189441A1 (en) | 2003-03-24 | 2004-09-30 | Kosmas Stergiou | Apparatus and methods for verification and authentication employing voluntary attributes, knowledge management and databases |
US20040245330A1 (en) | 2003-04-03 | 2004-12-09 | Amy Swift | Suspicious persons database |
US20040230527A1 (en) | 2003-04-29 | 2004-11-18 | First Data Corporation | Authentication for online money transfers |
US20060230358A1 (en) | 2003-05-02 | 2006-10-12 | Jorn Sacher | System for inspecting a printed image |
US20040255147A1 (en) | 2003-05-06 | 2004-12-16 | Vidius Inc. | Apparatus and method for assuring compliance with distribution and usage policy |
US20040225645A1 (en) | 2003-05-06 | 2004-11-11 | Rowney Kevin T. | Personal computing device -based mechanism to detect preselected data |
US20040267552A1 (en) | 2003-06-26 | 2004-12-30 | Contentguard Holdings, Inc. | System and method for controlling rights expressions by stakeholders of an item |
US20050008225A1 (en) | 2003-06-27 | 2005-01-13 | Hiroyuki Yanagisawa | System, apparatus, and method for providing illegal use research service for image data, and system, apparatus, and method for providing proper use research service for image data |
US20050039057A1 (en) | 2003-07-24 | 2005-02-17 | Amit Bagga | Method and apparatus for authenticating a user using query directed passwords |
US20050043960A1 (en) | 2003-08-19 | 2005-02-24 | David Blankley | System and automate the licensing, re-use and royalties of authored content in derivative works |
US20050043548A1 (en) | 2003-08-22 | 2005-02-24 | Joseph Cates | Automated monitoring and control system for networked communications |
US20050060643A1 (en) * | 2003-08-25 | 2005-03-17 | Miavia, Inc. | Document similarity detection and classification system |
US20050080846A1 (en) | 2003-09-27 | 2005-04-14 | Webhound, Inc. | Method and system for updating digital content over a network |
US20080002854A1 (en) | 2003-10-08 | 2008-01-03 | Verance Corporation | Signal continuity assessment using embedded watermarks |
US20070124251A1 (en) * | 2003-10-16 | 2007-05-31 | Sharp Kabushiki Kaisha | Content use control device, recording device, reproduction device, recording medium, and content use control method |
US7314162B2 (en) | 2003-10-17 | 2008-01-01 | Digimarc Corporation | Method and system for reporting identity document usage |
US20070033397A1 (en) * | 2003-10-20 | 2007-02-08 | Phillips Ii Eugene B | Securing digital content system and method |
US20060277564A1 (en) | 2003-10-22 | 2006-12-07 | Jarman Matthew T | Apparatus and method for blocking audio/visual programming and for muting audio |
US20070101360A1 (en) | 2003-11-17 | 2007-05-03 | Koninklijke Philips Electronics, N.V. | Commercial insertion into video streams based on surrounding program content |
US20070110089A1 (en) * | 2003-11-27 | 2007-05-17 | Advestigo | System for intercepting multimedia documents |
US20050125358A1 (en) | 2003-12-04 | 2005-06-09 | Black Duck Software, Inc. | Authenticating licenses for legally-protectable content based on license profiles and content identifiers |
US20050192902A1 (en) * | 2003-12-05 | 2005-09-01 | Motion Picture Association Of America | Digital rights management using multiple independent parameters |
US20050132235A1 (en) | 2003-12-15 | 2005-06-16 | Remco Teunen | System and method for providing improved claimant authentication |
US20050171851A1 (en) | 2004-01-30 | 2005-08-04 | Applebaum Ted H. | Multiple choice challenge-response user authorization system and method |
US20060010500A1 (en) * | 2004-02-03 | 2006-01-12 | Gidon Elazar | Protection of digital data content |
US20050193016A1 (en) | 2004-02-17 | 2005-09-01 | Nicholas Seet | Generation of a media content database by correlating repeating media content in media streams |
US20060240862A1 (en) | 2004-02-20 | 2006-10-26 | Hartmut Neven | Mobile image-based information retrieval system |
US20070198426A1 (en) | 2004-03-04 | 2007-08-23 | Yates James M | Method and apparatus for digital copyright exchange |
US20070083883A1 (en) | 2004-03-29 | 2007-04-12 | Deng Kevin K | Methods and apparatus to detect a blank frame in a digital video broadcast signal |
US20050222900A1 (en) | 2004-03-30 | 2005-10-06 | Prashant Fuloria | Selectively delivering advertisements based at least in part on trademark issues |
US20060059561A1 (en) | 2004-04-14 | 2006-03-16 | Digital River, Inc. | Electronic storefront that limits download of software wrappers based on geographic location |
US20080154401A1 (en) | 2004-04-19 | 2008-06-26 | Landmark Digital Services Llc | Method and System For Content Sampling and Identification |
US20050288952A1 (en) | 2004-05-18 | 2005-12-29 | Davis Bruce L | Official documents and methods of issuance |
US20070234213A1 (en) | 2004-06-07 | 2007-10-04 | Jason Krikorian | Selection and Presentation of Context-Relevant Supplemental Content And Advertising |
US20070168543A1 (en) | 2004-06-07 | 2007-07-19 | Jason Krikorian | Capturing and Sharing Media Content |
US20060115108A1 (en) | 2004-06-22 | 2006-06-01 | Rodriguez Tony F | Metadata management and generation using digital watermarks |
US7707427B1 (en) | 2004-07-19 | 2010-04-27 | Michael Frederick Kenrich | Multi-level file digests |
US20060034177A1 (en) | 2004-07-28 | 2006-02-16 | Audible Magic Corporation | System for distributing decoy content in a peer to peer network |
US20060080356A1 (en) | 2004-10-13 | 2006-04-13 | Microsoft Corporation | System and method for inferring similarities between media objects |
US20060085816A1 (en) | 2004-10-18 | 2006-04-20 | Funk James M | Method and apparatus to control playback in a download-and-view video on demand system |
US20060177198A1 (en) | 2004-10-20 | 2006-08-10 | Jarman Matthew T | Media player configured to receive playback filters from alternative storage mediums |
US20100017487A1 (en) | 2004-11-04 | 2010-01-21 | Vericept Corporation | Method, apparatus, and system for clustering and classification |
US20060106725A1 (en) | 2004-11-12 | 2006-05-18 | International Business Machines Corporation | Method, system, and program product for visual display of a license status for a software program |
US20060106675A1 (en) | 2004-11-16 | 2006-05-18 | Cohen Peter D | Providing an electronic marketplace to facilitate human performance of programmatically submitted tasks |
US20060106774A1 (en) | 2004-11-16 | 2006-05-18 | Cohen Peter D | Using qualifications of users to facilitate user performance of tasks |
US7881957B1 (en) | 2004-11-16 | 2011-02-01 | Amazon Technologies, Inc. | Identifying tasks for task performers based on task subscriptions |
US20060112015A1 (en) | 2004-11-24 | 2006-05-25 | Contentguard Holdings, Inc. | Method, system, and device for handling creation of derivative works and for adapting rights to derivative works |
US20060110137A1 (en) | 2004-11-25 | 2006-05-25 | Matsushita Electric Industrial Co., Ltd. | Video and audio data transmitting apparatus, and video and audio data transmitting method |
US20130022333A1 (en) | 2004-12-03 | 2013-01-24 | Nec Corporation | Video content playback assistance method, video content playback assistance system, and information distribution program |
US20070211174A1 (en) | 2005-01-05 | 2007-09-13 | Daniel Putterman | Windows management in a television environment |
US20070180537A1 (en) | 2005-01-07 | 2007-08-02 | Shan He | Method for fingerprinting multimedia content |
US20080152146A1 (en) | 2005-01-24 | 2008-06-26 | Koninklijke Philips Electronics, N.V. | Private and Controlled Ownership Sharing |
US20070061393A1 (en) | 2005-02-01 | 2007-03-15 | Moore James F | Management of health care data |
US7369677B2 (en) | 2005-04-26 | 2008-05-06 | Verance Corporation | System reactions to the detection of embedded watermarks in a digital host content |
US20070242880A1 (en) | 2005-05-18 | 2007-10-18 | Stebbings David W | System and method for the identification of motional media of widely varying picture content |
US20070154190A1 (en) * | 2005-05-23 | 2007-07-05 | Gilley Thomas S | Content tracking for movie segment bookmarks |
US20060287916A1 (en) | 2005-06-15 | 2006-12-21 | Steven Starr | Media marketplaces |
US20070130015A1 (en) | 2005-06-15 | 2007-06-07 | Steven Starr | Advertisement revenue sharing for distributed video |
US20080109306A1 (en) | 2005-06-15 | 2008-05-08 | Maigret Robert J | Media marketplaces |
US20060287996A1 (en) | 2005-06-16 | 2006-12-21 | International Business Machines Corporation | Computer-implemented method, system, and program product for tracking content |
US20070028308A1 (en) | 2005-07-29 | 2007-02-01 | Kosuke Nishio | Decoding apparatus |
US20070038567A1 (en) | 2005-08-12 | 2007-02-15 | Jeremy Allaire | Distribution of content |
US20070175998A1 (en) | 2005-09-01 | 2007-08-02 | Lev Zvi H | System and method for reliable content access using a cellular/wireless device with imaging capabilities |
US20070058925A1 (en) | 2005-09-14 | 2007-03-15 | Fu-Sheng Chiu | Interactive multimedia production |
US20070130177A1 (en) | 2005-09-23 | 2007-06-07 | Tina Schneider | Media management system |
US7529659B2 (en) | 2005-09-28 | 2009-05-05 | Audible Magic Corporation | Method and apparatus for identifying an unknown work |
US20070094145A1 (en) | 2005-10-24 | 2007-04-26 | Contentguard Holdings, Inc. | Method and system to support dynamic rights and resources sharing |
US20070110010A1 (en) | 2005-11-14 | 2007-05-17 | Sakari Kotola | Portable local server with context sensing |
US20070208751A1 (en) | 2005-11-22 | 2007-09-06 | David Cowan | Personalized content control |
US20070124756A1 (en) | 2005-11-29 | 2007-05-31 | Google Inc. | Detecting Repeating Content in Broadcast Media |
US20070130580A1 (en) | 2005-11-29 | 2007-06-07 | Google Inc. | Social and Interactive Applications for Mass Media |
US20070208711A1 (en) | 2005-12-21 | 2007-09-06 | Rhoads Geoffrey B | Rules Driven Pan ID Metadata Routing System and Network |
US20070192352A1 (en) | 2005-12-21 | 2007-08-16 | Levy Kenneth L | Content Metadata Directory Services |
US8341412B2 (en) | 2005-12-23 | 2012-12-25 | Digimarc Corporation | Methods for identifying audio or video content |
US20070162761A1 (en) | 2005-12-23 | 2007-07-12 | Davis Bruce L | Methods and Systems to Help Detect Identity Fraud |
US8458482B2 (en) | 2005-12-23 | 2013-06-04 | Digimarc Corporation | Methods for identifying audio or video content |
US8688999B2 (en) | 2005-12-23 | 2014-04-01 | Digimarc Corporation | Methods for identifying audio or video content |
US8868917B2 (en) | 2005-12-23 | 2014-10-21 | Digimarc Corporation | Methods for identifying audio or video content |
US20070156594A1 (en) | 2006-01-03 | 2007-07-05 | Mcgucken Elliot | System and method for allowing creators, artists, and owners to protect and profit from content |
US20080317278A1 (en) | 2006-01-16 | 2008-12-25 | Frederic Lefebvre | Method for Computing a Fingerprint of a Video Sequence |
US20090083228A1 (en) | 2006-02-07 | 2009-03-26 | Mobixell Networks Ltd. | Matching of modified visual and audio media |
US20070203911A1 (en) | 2006-02-07 | 2007-08-30 | Fu-Sheng Chiu | Video weblog |
US20080027931A1 (en) | 2006-02-27 | 2008-01-31 | Vobile, Inc. | Systems and methods for publishing, searching, retrieving and binding metadata for a digital object |
US20070220575A1 (en) | 2006-03-03 | 2007-09-20 | Verimatrix, Inc. | Movie studio-based network distribution system and method |
US20070253594A1 (en) | 2006-04-28 | 2007-11-01 | Vobile, Inc. | Method and system for fingerprinting digital video object based on multiresolution, multirate spatial and temporal signatures |
US20080034396A1 (en) | 2006-05-30 | 2008-02-07 | Lev Zvi H | System and method for video distribution and billing |
US20070282472A1 (en) | 2006-06-01 | 2007-12-06 | International Business Machines Corporation | System and method for customizing soundtracks |
US7831531B1 (en) | 2006-06-22 | 2010-11-09 | Google Inc. | Approximate hashing functions for finding similar content |
US7899694B1 (en) | 2006-06-30 | 2011-03-01 | Amazon Technologies, Inc. | Generating solutions to problems via interactions with human responders |
US20080005241A1 (en) | 2006-06-30 | 2008-01-03 | Searete Llc, A Limited Liability Corporation Of The State Of Delaware | Usage parameters for communication content |
US20080051029A1 (en) | 2006-08-25 | 2008-02-28 | Bradley James Witteman | Phone-based broadcast audio identification |
US8010511B2 (en) | 2006-08-29 | 2011-08-30 | Attributor Corporation | Content monitoring and compliance enforcement |
US9342670B2 (en) | 2006-08-29 | 2016-05-17 | Attributor Corporation | Content monitoring and host compliance evaluation |
US20150302886A1 (en) | 2006-08-29 | 2015-10-22 | Digimarc Corporation | Determination of originality of content |
US9031919B2 (en) | 2006-08-29 | 2015-05-12 | Attributor Corporation | Content monitoring and compliance enforcement |
US20150074748A1 (en) | 2006-08-29 | 2015-03-12 | Digimarc Corporation | Content monitoring and host compliance evaluation |
US20080059461A1 (en) | 2006-08-29 | 2008-03-06 | Attributor Corporation | Content search using a provided interface |
US8738749B2 (en) | 2006-08-29 | 2014-05-27 | Digimarc Corporation | Content monitoring and host compliance evaluation |
US20080059536A1 (en) | 2006-08-29 | 2008-03-06 | Attributor Corporation | Content monitoring and host compliance evaluation |
US20120011105A1 (en) | 2006-08-29 | 2012-01-12 | Attributor Corporation | Content monitoring and compliance enforcement |
US7730316B1 (en) | 2006-09-22 | 2010-06-01 | Fatlens, Inc. | Method for document fingerprinting |
US20080082405A1 (en) | 2006-09-29 | 2008-04-03 | Yahoo! Inc. | Digital media benefit attachment mechanism |
US7945470B1 (en) | 2006-09-29 | 2011-05-17 | Amazon Technologies, Inc. | Facilitating performance of submitted tasks by mobile task performers |
US20090144325A1 (en) | 2006-11-03 | 2009-06-04 | Franck Chastagnol | Blocking of Unlicensed Audio Content in Video Files on a Video Hosting Website |
US20140020116A1 (en) | 2006-11-03 | 2014-01-16 | Google Inc. | Blocking of unlicensed audio content in video files on a video hosting website |
US20080109369A1 (en) | 2006-11-03 | 2008-05-08 | Yi-Ling Su | Content Management System |
US20080162228A1 (en) | 2006-12-19 | 2008-07-03 | Friedrich Mechbach | Method and system for the integrating advertising in user generated contributions |
US20130085825A1 (en) | 2006-12-20 | 2013-04-04 | Digimarc Corp. | Method and system for determining content treatment |
US20080154739A1 (en) | 2006-12-22 | 2008-06-26 | Yahoo! Inc | Social Network Commerce Model |
US20080155701A1 (en) | 2006-12-22 | 2008-06-26 | Yahoo! Inc. | Method and system for unauthorized content detection and reporting |
US20130006946A1 (en) | 2006-12-22 | 2013-01-03 | Commvault Systems, Inc. | System and method for storing redundant information |
US20080162449A1 (en) | 2006-12-28 | 2008-07-03 | Chen Chao-Yu | Dynamic page similarity measurement |
US20080165960A1 (en) | 2007-01-09 | 2008-07-10 | Tagstory Co., Ltd. | System for providing copyright-protected video data and method thereof |
US8707459B2 (en) | 2007-01-19 | 2014-04-22 | Digimarc Corporation | Determination of originality of content |
US20080178302A1 (en) | 2007-01-19 | 2008-07-24 | Attributor Corporation | Determination of originality of content |
US20080209502A1 (en) | 2007-02-27 | 2008-08-28 | Seidel Craig H | Associating rights to multimedia content |
US20080249961A1 (en) | 2007-03-22 | 2008-10-09 | Harkness David H | Digital rights management and audience measurement systems and methods |
US20080240490A1 (en) | 2007-03-30 | 2008-10-02 | Microsoft Corporation | Source authentication and usage tracking of video |
US20090006225A1 (en) | 2007-06-29 | 2009-01-01 | Microsoft Corporation | Distribution channels and monetizing |
US20090030651A1 (en) | 2007-07-27 | 2009-01-29 | Audible Magic Corporation | System for identifying content of digital data |
WO2009017049A1 (en) | 2007-07-27 | 2009-02-05 | Aisin Seiki Kabushiki Kaisha | Door handle device |
US20090052784A1 (en) | 2007-08-22 | 2009-02-26 | Michele Covell | Detection And Classification Of Matches Between Time-Based Media |
US20090119169A1 (en) | 2007-10-02 | 2009-05-07 | Blinkx Uk Ltd | Various methods and apparatuses for an engine that pairs advertisements with video files |
US20090129755A1 (en) | 2007-11-21 | 2009-05-21 | Shlomo Selim Rakib | Method and Apparatus for Generation, Distribution and Display of Interactive Video Content |
US20090144772A1 (en) | 2007-11-30 | 2009-06-04 | Google Inc. | Video object tag creation and processing |
US20090165031A1 (en) | 2007-12-19 | 2009-06-25 | At&T Knowledge Ventures, L.P. | Systems and Methods to Identify Target Video Content |
WO2009100093A1 (en) | 2008-02-05 | 2009-08-13 | Dolby Laboratories Licensing Corporation | Associating information with media content |
US20090313078A1 (en) | 2008-06-12 | 2009-12-17 | Cross Geoffrey Mark Timothy | Hybrid human/computer image processing method |
Non-Patent Citations (54)
Title |
---|
Allamanche et al. "Content-Based Identification of Audio Material Using Mpeg-7 Low Level Description," in Proc. of the Int. Symp. of Music Information Retrieval, Indiana, USA, Oct. 2001. |
Amazon Mechanical Turk Developer Guide, 2006, 165 pp., API Version Oct. 31, 2006. |
Amazon Mechanical Turk Developer Guide, Dec. 16, 2005, 94 pp. |
Amazon Mechanical Turk Release Notes, Release Date Oct. 13, 2005 (Oct. 2005). |
Amazon's Mechanical Turk, thread from SlashDot, started Nov. 4, 2005. |
Apr. 2, 2014 Issue Notification; Nov. 25, 2013 Notice of Allowance and Interview Summary; Nov. 5, 2013 Amendment and Interview summary; Jun. 6, 2013 Non-final Office Action; Jun. 11, 2012 Amendment submitted with RCE; Feb. 10, 2012 final Rejection; Dec. 19, 2011 Amendment; Sep. 19, 2011 non-final Rejection; Nov. 1, 2010 Amendment submitted with RCE; Jun. 30, 2010 final Rejection; May 17, 2010 Amendment; Jan. 11, 2010 non-final Rejection; all from assignee's U.S. Appl. No. 11/655,748 (issued as U.S. Pat. No. 8,707,459). |
Apr. 27, 2016 Issue Notification; Jan. 20, 2016 Notice of Allowance and Interview Summary; Sep. 11, 2015 Amendment and Terminal Disclaimer; Jun. 11, 2015 Non-final Office Action; all from assignee's U.S. Appl. No. 14/288,124 (published as US 2015-0074748 A1). |
Assignee's U.S. Appl. No. 13/909,834, filed Jun. 4, 2013. |
Assignee's U.S. Appl. No. 13/937,995, filed Jul. 9, 2013. |
Baluja et al, Content Fingerprinting Using Wavelets, 3rd European Conference on Visual Media Production, Nov. 2006. |
Batlle et al., "Automatic Song Identification in Noisy Broadcast Audio," in Proc. of the SIP, Aug. 2002. |
Boutin, Crowdsourcing, Consumers as Creators, Business Week, Jul. 13, 2006. |
Brin, et al, Copy detection mechanisms for digital documents, ACM SIGMOD Record, vol. 24, No. 2, 1995. |
Brin, et al., "Copy Detection Mechanisms for Digital Documents," SIGMOD '95, Proceedings of the 1995 ACM SIGMOD international conference on Management of data, pp. 398-409, 1995. |
Cano et al, "A Review of Audio Fingerprinting," Journal of VLSI Signal Processing, 41, 271, 272, 2005. |
Cheung, et al, Efficient Video Similarity Measurement with Video Signature, IEEE Trans. on Circuits and Systems for Video Technology, 13.1, pp. 59-74, 2003. |
Covell et al, Advertisement Detection and Replacement using Acoustic and Visual Repetition, IEEE Int'l Workshop on Multimedia Signal Processing, Oct. 2006. |
Crowdsourcing article from Wikipedia, Dec. 18, 2006. |
DRM Watch, Guba Introduces Video Fingerprint Filtering, Jul. 27, 2006, http://www.drmwatch.com/article.php/3623091. |
Email about the product "Sentinel", Nov. 14, 2006. |
Fink et al, Social and interactive television applications based on real-time ambient audio identification, 4th European Interactive TV Conference, May 2006, pp. 138-146. |
Ghias, et al., "Query by Humming: Musical Information Retrieval in an Audio Database," ACM Multimedia, pp. 231-236, Nov. 1995. |
Global File Registry, Technical White Paper, Draft 1-26, May 2006. |
Haitsma, et al, "A Highly Robust Audio Fingerprinting System," Proc. Intl Conf on Music Information Retrieval, 2002. |
Heintze, "Scalable Document Fingerprinting (Extended Abstract)," Bell Laboratories, 1996. |
Hoad, et al, Methods for Identifying Versioned and Plagiarized Documents, Journal of the American Society for Information Science and Technology, 54.3, pp. 203-215, 2003. |
Howe et al, Crowdsourcing blog at crowdsourcing.typepad.com, Dec. 17, 2006, as retrieved from web.archive.org on Apr. 23, 2008. |
Howe et al, Crowdsourcing blog at crowdsourcing.typepad.com, Nov. 7, 2006, as retrieved from web.archive.org on Apr. 23, 2008. |
Howe, Look Who's Crowdsourcing, Wired Magazine, Jun. 2006. |
Howe, The Rise of Crowdsourcing, Wired Magazine, Jun. 2006. |
Jul. 3, 2014 Notice of Appeal, Jan. 3, 2014 final Office Action, and Dec. 3, 2013 Amendment; all from assignee's U.S. Appl. No. 13/686,541 (published as 2013-0085825). |
Jul. 9, 2014 Office Action, and application specification and drawings with a filing date of May 6, 2014; all from assignee's U.S. Appl. No. 14/271,297. |
Kageyama et al, "Melody Retrieval with Humming," Proceedings of Int. Computer Music Conference (ICMC), 1993. |
Kalker et al, "Robust Identification of Audio Using Watermarking and Fingerprinting," in Multimedia Security Handbook, CRC Press, 2005. |
Ke, et al, Efficient Near-Duplicate Detection and Sub-Image Retrieval, Intel Research Report IRP-TR-04-03, 2004. |
Konstantinou, et al., "A Dynamic JAVA-Based Intelligent Interface for Online Image Database Searches," VISUAL'99, LNCS 1614, pp. 211-220, 1999. |
Krikorian, U.S. Appl. No. 60/823,066, filed Aug. 21, 2006 (priority document for US20070168543). |
Liu et al, U.S. Appl. No. 60/856,501, filed Nov. 3, 2006, entitled "Rights Management" (which serves as a priority application for published US application 20080109369). |
Manber, "Finding Similar Files in a Large File System," Jan. 1994 Winter USENIX Technical Conference, 10 pages (as submitted Oct. 1993). |
Mar. 11, 2016 non-final Office Action; Jan. 14, 2016 Office of Petitions Decision; May 27, 2015 Petition for review by the Office of Petitions; Apr. 22, 2014 claims as filed; all from assignee's U.S. Appl. No. 14/258,633 (published as US 2015-0302886 A1). |
May 7, 2014 Issue Notification; Jan. 15, 2014 Notice of Allowance and Interview Summary; Dec. 12, 2013 Amendment; Jun. 14, 2013 Non-final Office Action; Jan. 12, 2012 Amendment submitted with RCE; Oct. 13, 2011 Final Rejection; Aug. 24, 2011 Examiner's Interview Summary; Aug. 15, 2011 Amendment; Apr. 13, 2011 non-final Rejection; Feb. 23, 2011 Amendment submitted with RCE; Dec. 21, 2010 final Rejection; Oct. 1, 2010 Amendment; Jun. 25, 2010 non-final Rejection; Apr. 23, 2010 Examiner's interview summary; Apr. 19, 2010 Amendment; Jan. 13, 2010 non-final Office Action; Dec. 7, 2009 Amendment submitted with RCE; Sep. 3, 2009 final Rejection; Jun. 22, 2009 Amendment; Mar. 17, 2009 non-final Rejection; all from assignee's U.S. Appl. No. 11/512,067 (issued as U.S. Pat. No. 8,738,749). |
Meckbach, U.S. Appl. No. 60/870,817, filed Dec. 19, 2006 (priority document for US20080162228). |
Nov. 21, 2014 Notice of Allowance, Notice of Allowability and Interview Summary; Oct. 29, 2014 Amendment including Interview Summary; Sep. 9, 2014 Interview Summary; Sep. 4, 2014 Amendment including Interview Summary; Jul. 9, 2014 non-final Office Action; all from U.S. Appl. No. 14/271,297. |
Popescu, et al, Exposing Digital Forgeries in Color Filter Array Interpolated Images, IEEE Trans. on Signal Processing 53.10, pp. 3948-3959, 2005. |
Prosecution excerpts of U.S. Appl. No. 14/271,297 (now U.S. Pat. No. 8,935,745), including applicant submissions dated May 6, 2014, Sep. 4, 2014, and Oct. 29, 2014, and Office communications dated Jul. 9, 2014, Sep. 9, 2014 and Nov. 21, 2014. |
Prosecution excerpts of U.S. Appl. No. 14/541,422 (published as US20150074833), including applicant submissions dated Nov. 14, 2014, Jul. 23, 2015, Mar. 10, 2016, Mar. 17, 2016, and Office communications dated Jan. 23, 2015, Oct. 15, 2015, and Mar. 25, 2016. |
Release-Amazon Mechanical Turk on Dec. 1, 2006. |
U.S. Appl. No. 60/740,760, filed Nov. 29, 2005 (from which published application 20070124756 claims priority). |
U.S. Appl. No. 60/771,536, filed Feb. 7, 2006 (from which published application US20090083228 claims priority). |
U.S. Appl. No. 60/818,182, filed Jun. 30, 2006 (from which U.S. Pat. No. 7,831,531 claims priority). |
U.S. Appl. No. 60/822,483 (from which published application US20080109306 claims priority). |
U.S. Appl. No. 60/823,066, filed Aug. 21, 2006. |
Von Ahn, Human Computation, CMU PhD Thesis, Dec. 7, 2005, 87pp. |
Wold et al., "Content-Based Classification, Search and Retrieval of Audio," IEEE MultiMedia 1996. |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10735381B2 (en) | 2006-08-29 | 2020-08-04 | Attributor Corporation | Customized handling of copied content based on owner-specified similarity thresholds |
US20160196342A1 (en) * | 2015-01-06 | 2016-07-07 | Inha-Industry Partnership | Plagiarism Document Detection System Based on Synonym Dictionary and Automatic Reference Citation Mark Attaching System |
US12141882B2 (en) | 2019-11-19 | 2024-11-12 | Google Llc | Methods, systems, and media for rights management of embedded sound recordings using composition clustering |
Also Published As
Publication number | Publication date |
---|---|
WO2008088888A1 (en) | 2008-07-24 |
US8707459B2 (en) | 2014-04-22 |
US20150074833A1 (en) | 2015-03-12 |
US8935745B2 (en) | 2015-01-13 |
US20140259097A1 (en) | 2014-09-11 |
US20080178302A1 (en) | 2008-07-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10735381B2 (en) | 2020-08-04 | Customized handling of copied content based on owner-specified similarity thresholds |
US9436810B2 (en) | 2016-09-06 | Determination of copied content, including attribution |
US9842200B1 (en) | 2017-12-12 | Content monitoring and host compliance evaluation |
US8010511B2 (en) | 2011-08-30 | Content monitoring and compliance enforcement |
US20080059461A1 (en) | 2008-03-06 | Content search using a provided interface |
US20080059211A1 (en) | 2008-03-06 | Content monitoring and compliance |
US20100268628A1 (en) | 2010-10-21 | Managing controlled content on a web page having revenue-generating code |
US20080059425A1 (en) | 2008-03-06 | Compliance information retrieval |
Urban et al. | 2005 | Efficient Process or "Chilling Effects"? Takedown Notices Under Section 512 of the Digital Millennium Copyright Act |
Grimmelmann | 2007 | The structure of search engine law |
Sag | 2009 | Copyright and copy-reliant technology |
US8644646B2 (en) | 2014-02-04 | Automatic identification of digital content related to a block of text, such as a blog entry |
US7996882B2 (en) | 2011-08-09 | Digital asset distribution system |
US8073828B2 (en) | 2011-12-06 | Licensed rights clearance and tracking for digital assets |
US20080228733A1 (en) | 2008-09-18 | Method and System for Determining Content Treatment |
Huang | 2017 | Comparison of e-commerce regulations in Chinese and American ftas: Converging approaches, diverging contents, and polycentric directions? |
Ginsburg et al. | 2018 | Embedding Content or Interring Copyright: Does the Internet Need the Server Rule |
Major | 1998 | Copyright law tackles yet another challenge: The Electronic frontier of the World Wide Web |
Kingsbury | 2018 | Copyright paste: The unfairness of sticking to transformative use in the digital age |
WO2008027365A2 (en) | 2008-03-06 | Content monitoring and compliance |
Lipinski et al. | 2013 | Look before You License: The Use of Public Sharing Websites in Building Co-Created Community Repositories |
Choraś et al. | 2023 | Not Only Security and Privacy: The Evolving Ethical and Legal Challenges of E-Commerce |
Feng | 2024 | Business Proposal of online copyright protection platform for digital assets |
Dinh | 2008 | Click Here to Share-The Impact of the Veoh Litigations on Viacom v. YouTube |
Li | 2002 | Network Copyright Rules in the People's Republic of China |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
2016-08-17 | STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
2018-08-21 | AS | Assignment |
Owner name: ATTRIBUTOR CORPORATION, OREGON Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BROCK, JAMES L.;PITKOW, JAMES E.;REEL/FRAME:046652/0954 Effective date: 20060928 Owner name: ATTRIBUTOR CORPORATION, OREGON Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BROCK, JAMES L.;PITKOW, JAMES E.;REEL/FRAME:046653/0563 Effective date: 20070309 |
2020-02-18 | MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |
2022-01-19 | AS | Assignment |
Owner name: LEMON INC., CAYMAN ISLANDS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ATTRIBUTOR CORPORATION;REEL/FRAME:058700/0087 Effective date: 20211231 |
2024-02-01 | MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |