This seems like it would be really simple, but I cannot seem to resolve it. Using EF Core, I have a DbSet<Rule> Rules
in my DbContext
.
public class Rule
{
public int Id { get; set; }
public string Raw { get; set; }
}
I am trying to write a query where, given an IEnumerable<string> lines
, give me all of the Rule
s from the DbSet
where its Raw
value is an element in lines
(exact match, not a substring of a value).
For some time, I was using something like:
private IQueryable<Rule> GetExistingRules() =>
dbContext.Rules.Where(r => lines.Contains(r.Raw));
But, I have since discovered that (I think) this was not doing what I was expecting. (This method is immediately followed by inserting new Rule
s for all elements of lines
that do not currently exist. I was getting duplicate Rule
s with the same Raw
value...) I think, instead, I need to use .Intersect()
?
I tried using a custom EqualityComparer per this, but it throws an exception.
private IQueryable<Rule> GetExistingRules()
{
var lineRules = lines.Select(l => new Rule {Raw = l});
return dbContext.Rules.Intersect(lineRules, new RuleRawEqualityComparer());
}
private class RuleRawEqualityComparer : IEqualityComparer<Rule>
{
public bool Equals(Rule x, Rule y) => x?.Raw == y?.Raw;
...
}
Could not parse expression 'value(Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryable`1[FilterLists.Data.Entities.Rule]).Intersect(__p_0, __p_1)': This overload of the method 'System.Linq.Queryable.Intersect' is currently not supported.
What is the best way to compose this query? Note that it is in a chain of DbContext
interactions, so I'd prefer to keep the return type as an IQueryable
to enable EF's lazy query composition.
Update: More info on why I suspected the Contains()
approach was not working:
This is the class where the query is being used. I am seeing exceptions like below because the Raw
column in the database has a unique constraint. I thought my logic (use of Except()
in CreateNewRules()
) would prevent any rows in Rules
with duplicate Raw
values, but maybe my issue lies elsewhere...
public class SnapshotBatch
{
private readonly FilterListsDbContext dbContext;
private readonly IEnumerable<string> lines;
private readonly Data.Entities.Snapshot snapEntity;
public SnapshotBatch(FilterListsDbContext dbContext, IEnumerable<string> lines,
Data.Entities.Snapshot snapEntity)
{
this.dbContext = dbContext;
this.lines = lines;
this.snapEntity = snapEntity;
}
public async Task SaveAsync()
{
var existingRules = GetExistingRules();
var newRules = CreateNewRules(existingRules);
dbContext.Rules.AddRange(newRules);
var rules = existingRules.Concat(newRules);
AddSnapshotRules(rules);
await dbContext.SaveChangesAsync();
}
private IQueryable<Rule> GetExistingRules() =>
dbContext.Rules.Where(r => lines.Contains(r.Raw));
private List<Rule> CreateNewRules(IQueryable<Rule> existingRules) =>
lines.Except(existingRules.Select(r => r.Raw))
.Select(r => new Rule {Raw = r})
.ToList();
private void AddSnapshotRules(IQueryable<Rule> rules) =>
snapEntity.AddedSnapshotRules
.AddRange(rules.Select(r => new SnapshotRule {Rule = r}));
}
Snippet from exception StackTrace (where '###Meebo:AdElement.Root' is a sample value for Raw
in the Rules
table):
FilterLists.Services.Snapshot.Snapshot.TrySaveAsync() in /home/travis/build/collinbarrett/FilterLists/src/FilterLists.Services/Snapshot/Snapshot.cs:line 43 Duplicate entry '###Meebo:AdElement.Root' for key 'IX_rules_Raw' at MySql.Data.MySqlClient.MySqlDataReader.ActivateResultSet(ResultSet resultSet) in C:\projects\mysqlconnector\src\MySqlConnector\MySql.Data.MySqlClient\MySqlDataReader.cs:line 93 at
Update 2: I am fairly sure the issue I was seeing with Contains()
was due to this issue that has an active PR. Because my strings have all kinds of special characters, I think they were not getting escaped properly when passed into Contains()
, but they seem to be properly escaped as parameters in a Join()
.
Do not forget that when you use linQ with EFCore and IQueryable, it translate the c# code in Sql statements.
Did you tried this ?
var query = from rule in dbContext.Rules
join line in lines
on rule.Raw equals line
select rule;
You wrote:
Give me all of the Rules from the DbSet where its Raw value is an element in lines (exact match etc.)
Your first solution will give the desired result:
var requestedRules = dbContext.Rules
.Where(rule => lines.Contains(rule));
In words: from the collection of Rules
, select only those Rules
that have a Raw
value that equals one of the values in the sequence of lines
.
I was getting duplicate Rules with the same Raw value...)
Well apparently your source collection has Rules with the same Raw value!
If you want only unique Raw values you'll have to decide what to do with duplicates:
Id Raw
1 "Hello"
2 "Hello"
Which one do you want? The first? the last? Both? None? Any?
Let's go for Any: we'll make groups of Rules with same Raw value, and from every Group we take the first one (or if you want the last, after all we don't care. But that is a bit less efficient.
var result = dbContext.Rules
.Where(rule => lines.Contains(rule))
.GroupBy(rule => rule.Raw)
.Select(group => group.FirstOrDefault());
In words: from the collection of Rules
, select only those Rules
that have a Raw
value that equals one of the values in the sequence of lines
. From the remaining elements make groups of Rules with same Raw value. Then from every Group take any element, for example the first one.
If you want all / only the first / only the last, you'll know what to do by now.