How it works in theory


The goal of the Similarity Calculation in Metropolis Launcher is to calculate, for every game B, a total similarity score between 0 and 100 relative to a chosen game A.


The calculation uses the metadata provided by the MobyGames database together with any metadata added by the user (via Edit Game). This metadata is grouped into feature sets.


Let Fi be a feature set. For example, F0 is the publisher's name, F1 is the year of release, F2 is the set of genres, etc.


Let Fi(X) be the value of the feature set Fi for game X. For example, if game A is "Sonic the Hedgehog (SEGA Genesis)" and F0 is the publisher's name, then F0(A) = "SEGA of America, Inc."


Fi(X) can yield the following value types:


1. A string of characters (publisher, developer, platform, etc.)
2. A number (rank, score, min. number of players supported, year, etc.)
3. A set of elements (e.g. genres is a set with elements "action", "adventure", "strategy", etc.)


Let Simi(A, B) be the similarity score for Fi(A) and Fi(B).


The calculation of Simi(A, B) depends on the value type of Fi:


1. A string of characters: The two strings are compared for exact equality.


  If Fi(A) = Fi(B) then Simi(A, B) = 100 else Simi(A, B) = 0.


2. A number: The similarity is derived from the absolute difference between the two numbers, scaled according to the range of values the feature set can take.


  Example 1: If Fi(X) ranges from 0 to 100, then Simi(A, B) = 100 - Abs(Fi(A) - Fi(B))

  Example 2: If Fi(X) ranges from 0.0 to 5.0, then Simi(A, B) = 100 - (20 * Abs(Fi(A) - Fi(B)))

  Other forms of calculations may apply to individual feature sets.


3. A set of elements: The similarity is the share of common elements among all distinct elements of both sets.


  |Fi(A) ∩ Fi(B)| is the number of distinct elements shared by games A and B.

  |Fi(A) ∪ Fi(B)| is the number of distinct elements that occur in game A, in game B, or in both.

  Simi(A, B) = (|Fi(A) ∩ Fi(B)| / |Fi(A) ∪ Fi(B)|) * 100


  Example:

  A = "Shining Force (SEGA Genesis)", Fi(A) = {"Role Playing (RPG)", "Strategy"}

  B = "Beyond Oasis (SEGA Genesis)", Fi(B) = {"Action", "Role Playing (RPG)"}

  |Fi(A) ∩ Fi(B)| = |{"Role Playing (RPG)"}| = 1

  |Fi(A) ∪ Fi(B)| = |{"Action", "Role Playing (RPG)", "Strategy"}| = 3

  Simi(A, B) = (1 / 3) * 100 ≈ 33
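
The three cases above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used by Metropolis Launcher: the function names are hypothetical, the numeric case generalizes the two examples by scaling with 100 / (range width), and the score of 0 for two empty sets is an assumption.

```python
def string_similarity(a: str, b: str) -> float:
    """Type 1: 100 if both strings are exactly equal, otherwise 0."""
    return 100.0 if a == b else 0.0


def number_similarity(a: float, b: float, lo: float, hi: float) -> float:
    """Type 2: 100 minus the absolute difference, scaled to the range lo..hi.

    Assumption: the scale factor is 100 / (hi - lo), which gives 1 for a
    0..100 range and 20 for a 0.0..5.0 range, matching the two examples.
    Individual feature sets may use other formulas.
    """
    scale = 100.0 / (hi - lo)
    return 100.0 - scale * abs(a - b)


def set_similarity(a: set, b: set) -> float:
    """Type 3: shared elements divided by all distinct elements, times 100."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets score 0
    return len(a & b) / len(union) * 100.0


shining_force = {"Role Playing (RPG)", "Strategy"}
beyond_oasis = {"Action", "Role Playing (RPG)"}
print(round(set_similarity(shining_force, beyond_oasis)))  # 33
```

The last lines reproduce the genre example: one shared element out of three distinct elements.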


The similarity scores of some feature sets should count more than others in the total score:


Let Weighti be a factor from 0 to 10 assigned to each feature set Fi.


A weight of 0 excludes a feature set from the total score.


Finally, a feature set should be ignored completely in the total similarity score if both Fi(A) and Fi(B) are undefined (that is, neither game A nor game B has a value for the feature set Fi).


Let Ci be 0 if both Fi(A) and Fi(B) are undefined, and 1 otherwise.


Let Sim(A, B) be the total similarity score of game A and game B:


Sim(A, B) = SUM[Ci * Weighti * Simi(A, B)] / SUM[Ci * Weighti]
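
The weighted total can be sketched in Python as shown below. This is a minimal illustration, not the actual implementation. The function name is hypothetical, a score of None stands for a feature set that is undefined for both games (Ci = 0), and returning None when no feature set contributes is an assumption, since the text does not say what happens when the denominator is 0.

```python
from typing import Optional, Sequence


def total_similarity(
    scores: Sequence[Optional[float]],
    weights: Sequence[float],
) -> Optional[float]:
    """Weighted average of the per-feature-set scores.

    scores[i] is Simi(A, B), or None if both games lack feature set i.
    weights[i] is Weighti, a factor from 0 to 10.
    """
    numerator = 0.0
    denominator = 0.0
    for score, weight in zip(scores, weights):
        if score is None:  # Ci = 0: skip this feature set entirely
            continue
        numerator += weight * score
        denominator += weight
    if denominator == 0:
        return None  # assumption: nothing to compare, so no score
    return numerator / denominator


# Three feature sets; the third is undefined for both games and is skipped,
# so its weight of 10 does not affect the result.
print(total_similarity([100.0, 40.0, None], [3, 2, 10]))  # 76.0
```

Because skipped feature sets are removed from both sums, missing metadata does not pull the total score down.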