Friday, August 6, 2021

Summarizer aims to shorten your daily news by leveraging AI


Introduction

In this age of globalization, time is of the essence for everyone. Digital currency was introduced in part to reduce transaction time. Cryptocurrency was originally conceived as a payment system that lets people make transactions very quickly, without third parties, in a transparent, secure, and anonymous manner. With the growth and development of the crypto and blockchain ecosystem, several alternative investment opportunities have flourished, and some have proven to be more efficient and profitable than traditional financial instruments. Cryptocurrency is potentially the largest digital asset class for investment because it is user-friendly, secure, and cuts steep transaction costs. It has removed much of the friction from financial transactions worldwide. Leveraging blockchain technology, cryptocurrency has managed to set up a decentralized, transparent, and tamper-resistant system of accountability.


Summarizer is a fully automated newspaper.

Summarizer aims to make your daily news shorter by utilizing AI. Its bots crawl the web for news articles, summarize them, and then sort them into categories.

What is SUMMARIZER?

Summarizer is exclusive to $SMR holders. You won’t have to pay anything; simply hold $SMR to read Summarizer content. If at any time you decide to stop reading Summarizer, you can sell your $SMR back to the market.


The algorithm we use

TextRank is an unsupervised algorithm for the automated summarization of texts that can also be used to obtain the most important keywords in a document. The algorithm applies a variation of PageRank over a graph constructed specifically for the task of summarization. This produces a ranking of the elements in the graph: the most important elements are the ones that better describe the text. This approach allows TextRank to build summaries without the need of a training corpus or labeling and allows the use of the algorithm with different languages.

For the task of automated summarization, TextRank models any document as a graph using sentences as nodes. A function to compute the similarity of sentences is needed to build the edges between them. This function is used to weight the graph edges: the higher the similarity between two sentences, the more important the edge between them will be in the graph. In terms of the random walker often used to explain PageRank, we are more likely to go from one sentence to another if they are very similar.

TextRank determines the similarity between two sentences based on the content that both share. This overlap is calculated simply as the number of common lexical tokens between them, divided by the sum of the logarithms of their lengths to avoid promoting long sentences.

The function featured in the original algorithm can be formalized as:
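Definition 1. Given two sentences S_i and S_j, each represented by a set of n words, with S_i = w_1^i, w_2^i, …, w_n^i, the similarity function for S_i, S_j can be defined as:

\[
\mathrm{Sim}(S_i, S_j) = \frac{\left|\{\, w_k \mid w_k \in S_i \wedge w_k \in S_j \,\}\right|}{\log(|S_i|) + \log(|S_j|)} \tag{1}
\]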


The result of this process is a dense graph representing the document. From this graph, PageRank is used to compute the importance of each vertex. The most significant sentences are selected and presented as the summary, in the same order as they appear in the document.

The variations we tested change the way in which distances between sentences are computed to weight the edges of the graph used for PageRank. These similarity measures are orthogonal to the TextRank model, so they can be easily integrated into the algorithm. We found some of these variations to produce significant improvements over the original algorithm.
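The pipeline described above (sentence graph, PageRank, top sentences in document order) can be sketched in a few lines of Python. This is only an illustrative sketch, not Summarizer's production code: the function names, the regex tokenizer, the damping factor of 0.85 and the fixed 50 iterations are all assumptions made for the example, and sentence length is approximated by the number of unique tokens.

```python
import math
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of word tokens (assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def overlap_similarity(a, b):
    """Shared tokens divided by log|a| + log|b|, using unique-token counts."""
    if not a or not b:
        return 0.0
    norm = math.log(len(a)) + math.log(len(b))
    if norm <= 0.0:  # both sentences have a single token
        return 0.0
    return len(a & b) / norm


def pagerank(weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over a symmetric weight matrix."""
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of the edge between j and i.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def summarize(sentences, top_n=2):
    """Return the top_n highest-ranked sentences in their original order."""
    if not sentences:
        return []
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [
        [overlap_similarity(tokens[i], tokens[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_n]
    return [sentences[i] for i in sorted(best)]


article = [
    "The cat sat on the mat.",
    "The cat sat on the sofa.",
    "The cat sat quietly.",
    "Bananas are yellow.",
]
print(summarize(article, top_n=2))
```

A sentence that shares no tokens with the rest of the document has no edges, so it only receives the baseline score and is the first to be dropped from the summary.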

BM25

BM25 (also known as Okapi BM25) is a ranking function widely used as the state of the art for Information Retrieval tasks. BM25 is a variation of the TF-IDF model using a probabilistic model.

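Definition 2. Given two sentences R and S, BM25 is defined as:

\[
\mathrm{BM25}(R, S) = \sum_{i=1}^{n} \mathrm{IDF}(s_i) \cdot \frac{f(s_i, R) \cdot (k + 1)}{f(s_i, R) + k \cdot \left(1 - b + b \cdot \frac{|R|}{\mathrm{avgDL}}\right)} \tag{2}
\]

where s_1, …, s_n are the terms of S, f(s_i, R) is the frequency of s_i in R, |R| is the length of R, and k and b are parameters. We used k = 1.2 and b = 0.75. avgDL is the average length of the sentences in our collection.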

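This function definition implies that if a word appears in more than half the documents of the collection, its IDF will have a negative value. Since this can cause problems in the next stage of the algorithm, we used the following correction formula:

\[
\mathrm{IDF}(s_i) =
\begin{cases}
\log(N - n(s_i) + 0.5) - \log(n(s_i) + 0.5) & \text{if } n(s_i) \le N/2 \\
\varepsilon \cdot \mathrm{avgIDF} & \text{if } n(s_i) > N/2
\end{cases} \tag{3}
\]

where N is the number of documents in the collection, n(s_i) is the number of documents containing s_i, ε takes a value between 0.3 and 0.5, and avgIDF is the average IDF for all terms. Other corrective strategies were also tested, setting ε = 0 and using simpler modifications of the classic IDF formula.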
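The BM25 weighting and the IDF correction described above can be sketched as follows. This is a hedged illustration rather than the exact implementation: it assumes the common Okapi form of IDF, log(N − n + 0.5) − log(n + 0.5), treats each sentence as a pre-tokenized list of words, and picks ε = 0.3 from the stated range. The function names `build_idf` and `bm25` are hypothetical.

```python
import math
from collections import Counter


def build_idf(sentences, epsilon=0.3):
    """IDF per term; terms in more than half the sentences get epsilon * avgIDF."""
    n_docs = len(sentences)
    # Number of sentences that contain each term at least once.
    doc_freq = Counter(term for s in sentences for term in set(s))
    raw = {
        term: math.log(n_docs - df + 0.5) - math.log(df + 0.5)
        for term, df in doc_freq.items()
    }
    avg_idf = sum(raw.values()) / len(raw) if raw else 0.0
    return {
        term: raw[term] if doc_freq[term] <= n_docs / 2 else epsilon * avg_idf
        for term in raw
    }


def bm25(r, s, idf, avg_len, k=1.2, b=0.75):
    """BM25 score of sentence r with respect to the terms of sentence s."""
    freq = Counter(r)
    score = 0.0
    for term in s:
        f = freq[term]  # frequency of the term in r (0 if absent)
        denom = f + k * (1 - b + b * len(r) / avg_len)
        score += idf.get(term, 0.0) * f * (k + 1) / denom
    return score


corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "bird", "flew"],
    ["a", "fish", "swam"],
]
idf = build_idf(corpus)
avg_len = sum(len(s) for s in corpus) / len(corpus)
# Edge weight from sentence 0 to sentence 1 in the TextRank graph.
print(bm25(corpus[0], corpus[1], idf, avg_len))
```

In this toy corpus the word "the" appears in three of four sentences, so its uncorrected IDF would be negative; the correction replaces it with a small positive value so that shared common words never reduce an edge weight.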

Evaluation

We tested LCS, Cosine Sim, BM25 and BM25+ as different ways to weight the edges for the TextRank graph. The best results were obtained using BM25 and BM25+ with the corrective formula shown in equation 3. We achieved



Tokenomics

The distribution of SMR tokens will take place in three stages. First, a private sale at $0.008 per SMR; join the whitelist to participate. Second, a public sale, held after the private sale, at $0.01 per SMR. Third, a launch on PancakeSwap, planned after the private and public sales, with a starting price of $0.012 per SMR. The project allocation is shown in the screenshot.


The SMR token is deployed on Binance Smart Chain, and we are integrating it with Summarizer via Web3. The token has passed the TechRate audit, and its source code has been published and verified on BSCScan.

Summarizer Roadmap

Summarizer aims to make your daily news shorter by utilizing AI.
  • TextRank Implementation — We implement TextRank as the foundation for Summarizer. The basic idea of TextRank is to assign a score to each sentence in a text, then use the top-n sentences to build a summary.
  • Summarizer Design — We want it to be both elegant and fast. It comes with dark and light modes, strips out all unnecessary elements, and is optimized for speed.
  • Building a bot army — Run by bots, but for humans. Summarizer is run by a family of bots: crawler-bot, summa-bot, editor-bot, delivery-bot, optimizing-bot, repairing-bot, etc.
  • Beta Summarizer — Let it run by itself. We have run it in beta mode for over a year. It has processed over 60k articles, all by itself.
  • Integrate with SMR Token — We deploy SMR on Binance Smart Chain and integrate the token with Summarizer via Web3. Summarizer will soon be exclusive to SMR holders.
  • Summarizer on Android — Though the website is pretty fast and sleek, we want to use native Android features for Summarizer readers. The app will be available for Android users in just a few days.
  • Summarizer on iOS — We’re building an iOS app simultaneously with the Android app. iOS users will be able to fully experience Summarizer very soon.
  • SMR Landing Page — We build this to offer users information about our token launch in the most transparent way.
  • Marketing Plan for the Launch — To spread the word about Summarizer widely, we’ve planned a marketing strategy with various activities across all sectors: articles on Bitcointalk, a presence on social networks like Twitter and Reddit, a Telegram community, airdrops and a content reward system, ads on cryptocurrency-related websites, listings on coinhunt and coinsniper, etc.
  • Audit by TechRate — Our token, SMR, has been audited by TechRate. Its source code has also been published and verified on BSCScan.
  • Distribution of SMR Token — The distribution will happen in three phases. Private Sale, at the rate of $0.008 per SMR. Public Sale, which will happen after the Private Sale, at the rate of $0.01 per SMR. Launch on PancakeSwap, which will happen after the Private and Public Sales. Starting price: $0.012 per SMR.
  • Ongoing Marketing Activities — Articles on Bitcointalk, a presence on social networks like Twitter and Reddit, a Telegram community, airdrops and a content reward system, ads on cryptocurrency-related websites, listings on coinhunt and coinsniper, etc.
  • Get tracked by CoinGecko & CoinMarketCap — We’ve fully built the SMR token profile on BSCScan and have already submitted our applications to CoinGecko and CoinMarketCap.
  • Make it exclusive to SMR holders — After the distribution phase, Summarizer content will be exclusive to SMR holders.
  • Exchanges Listing — List SMR on the first few exchanges.
  • Berkeley Model Implementation — This update will improve the summaries’ quality and clarity. The model compresses source document text based on constraints from constituency parses and RST discourse parses. Moreover, it can improve summary clarity by reexpressing pronouns whose antecedents would otherwise be deleted or unclear.
  • A Customized Summarizer for Every User — Each user on Summarizer will be able to customize their experience and the content that appears in the UI. Choose and redesign the Summarizer layout and list of categories based on your interests.
  • Bridging SMR to other chains using Anyswap — Using Anyswap, we’ll bridge SMR to Ethereum, Polygon and Harmony blockchains. This will increase Summarizer accessibility to users of other chains. And by making SMR available on multiple chains, it will create more trading pairs & trading volume for SMR on DEXs such as Uniswap and SushiSwap.
  • Open Summarizer technology to other news publishers — We’re planning to open our technology to other news publishers. Using Summarizer, news publishers can effortlessly create summaries for their articles in bulk and deliver those summaries to their readers via newsletters and news feeds. They can even create their own version of the Summarizer website with just a few clicks. Profits generated from these activities will be used to buy back and burn SMR.
Summarizer Team
  • Brandon Thomas — Frontend Developer
  • Chris Miller — Blockchain Developer
  • Joy Stewart — Communications Manager
  • Julie Hardin — Marketing Manager
  • Mike Cook — Graphic Designer
  • Robert Hoover — Backend Developer
  • Steve Willis — Software Engineer
Useful links to the project:
Username : pijayplow93
Profile : https://bitcointalk.org/index.php?action=profile;u=2750501
BEP-20 wallet address : 0x476366b6839e2046bE2DCf258476633cAE21195e
